
We stand on the frontier of the AI revolution. Over the past decade, deep learning has emerged from the seismic collision of data availability and massive computing power, enabling a number of impressive AI capabilities. But we face a paradoxical challenge: automation is labor intensive. It sounds like a joke, but it’s not, as anyone who has tried to solve business problems with AI knows.
Traditional AI tools, while powerful, can be expensive, time-consuming, and difficult to use. Data must be painstakingly collected, curated, and labeled with task-specific annotations to train AI models. Building a model requires specialized, hard-to-find skills, and each new task requires repeating the process. As a result, businesses have focused primarily on automating tasks with abundant data and high business value, leaving everything else on the table. But this is starting to change.
The emergence of transformers and self-supervised learning methods has allowed us to tap into vast quantities of unlabeled data, paving the way for large pre-trained models, sometimes called “foundation models.” These large models have lowered the cost and labor involved in automation.
Foundation models provide a powerful, versatile base for a wide range of AI applications. We can use them to perform tasks quickly, with limited labeled data and minimal effort; in some cases, we need only describe the task at hand to coax the model into solving it.
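To make that concrete, here is a minimal sketch of zero-shot prompting, where describing the task is the only “training” required. It uses the open-source Hugging Face transformers library; the checkpoint google/flan-t5-small is an arbitrary public model chosen for illustration, not one of the IBM models discussed below.

```python
# Zero-shot prompting: describe the task in plain language and let a
# pre-trained model solve it, with no task-specific training data.
from transformers import pipeline

# An arbitrary small public checkpoint, used here purely for illustration.
generator = pipeline("text2text-generation", model="google/flan-t5-small")

prompt = (
    "Classify the sentiment of this customer review as positive or negative:\n"
    "'The replacement part arrived two weeks late and did not fit.'"
)
print(generator(prompt)[0]["generated_text"])
```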
But these powerful technologies also introduce new risks and challenges for enterprises. Many of today’s models are trained on datasets of questionable quality, leading to biased, toxic, or inaccurate responses. The largest models are expensive and labor-intensive to train and run, and complex to deploy.
At IBM, we are developing an approach that addresses the core challenges of leveraging foundation models for the enterprise. Today, we announced watsonx.ai, IBM’s gateway to the latest AI tools and technologies on the market. As a testament to how fast the field is moving, some of these tools are only weeks old, and we’re adding new ones as I write.
What’s included in watsonx.ai – a major part of the IBM watsonx offerings announced this week – is diverse and will continue to evolve, but our overarching commitment is the same: to deliver secure, enterprise-ready automation products.
watsonx.ai is part of IBM’s ongoing work to accelerate our customers’ journeys toward deriving value from this new AI paradigm. Here, I’ll describe our work to build enterprise-grade, IBM-trained foundation models, including our approach to data and our model architectures. I’ll also describe the new platform and tooling that enable enterprises to build and deploy foundation model-based solutions, using a broad catalog of open-source models in addition to our own.
Data: The foundation of your foundation model
Data quality matters. An AI model trained on biased or toxic data will naturally tend to produce biased or toxic outputs. This problem is compounded in the era of foundation models, where the data used to train a model comes from so many sources and is so vast that no human could reasonably vet it all.
Because data is the fuel that powers foundation models, we at IBM focus on carefully curating everything that goes into our models. We have developed AI tools to filter our training data for hate speech and profanity, licensing restrictions, and bias. When objectionable data is identified, we remove it, retrain the model, and iterate.
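The sketch below illustrates the shape of such a filter-and-retrain loop. IBM has not published the internals of its filters, so is_objectionable here is a hypothetical stand-in for whatever combination of hate-and-profanity, licensing, and bias classifiers is actually used.

```python
# Illustrative data-filtering loop: score each document, drop flagged ones,
# and keep the rest for (re)training. All names here are hypothetical.
from typing import Iterable, Iterator

def is_objectionable(text: str) -> bool:
    # Stand-in for learned classifiers covering hate speech and profanity,
    # license restrictions, and bias. Here: a trivial keyword check.
    blocklist = {"objectionable-term-a", "objectionable-term-b"}
    return any(term in text.lower() for term in blocklist)

def filter_corpus(docs: Iterable[str]) -> Iterator[str]:
    for doc in docs:
        if not is_objectionable(doc):
            yield doc  # kept for the next training run
        # flagged documents are removed; the model is then retrained

clean_docs = list(filter_corpus(["a benign document",
                                 "text containing objectionable-term-a"]))
print(len(clean_docs))  # -> 1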
Data curation is never a finished task. We continue to develop and refine new methods to improve data quality and controls, and to meet evolving legal and regulatory requirements. We have built an end-to-end framework to track the raw data that has been filtered, the methods used, and the models each data point has influenced.
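As a rough illustration of what such provenance tracking records, here is a minimal data structure; the field names and values are hypothetical, not IBM’s actual schema.

```python
# Hypothetical provenance record: which raw data was filtered, by which
# method, and which trained models each data point has influenced.
from dataclasses import dataclass, field

@dataclass
class ProvenanceRecord:
    doc_id: str
    source: str                          # origin of the raw data
    filters_applied: list[str] = field(default_factory=list)
    removed: bool = False                # dropped during curation?
    models_influenced: list[str] = field(default_factory=list)

record = ProvenanceRecord(doc_id="doc-0001", source="example-web-crawl")
record.filters_applied.append("hate-and-profanity-filter")
record.models_influenced.append("example-language-model")
```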
We continue to amass high-quality data to help address some of the most pressing business challenges, in domains as diverse as finance, law, cybersecurity, and sustainability. We are currently targeting more than 1 terabyte of curated text for training our foundation models, and we are adding curated software code, satellite data, and IT network event data and logs.
IBM Research is also developing techniques to build trust, reduce bias, and improve model safety throughout the foundation model lifecycle. This includes our work on FairIJ, which identifies and edits out biased data points in the data used to tune a model. Other methods, such as fairness reprogramming, allow us to reduce a model’s bias even after it has been trained.
Efficient foundation models focused on enterprise value
IBM’s new watsonx.ai studio offers a suite of foundation models aimed at delivering enterprise value. They have been incorporated into a range of IBM products that will be made available to IBM customers in the coming months.
Recognizing that one size does not fit all, we are building a family of language and code foundation models of varying sizes and architectures. Each model family has a geology-themed code name – Granite, Sandstone, Obsidian, and Slate – and brings together leading innovations from IBM Research and the open research community. Each model can be customized for a range of enterprise tasks.
Our Granite models are based on a decoder-only, GPT-like architecture for generative tasks. Sandstone models use an encoder-decoder architecture and are well suited to fine-tuning for specific tasks; they are interchangeable with Google’s popular T5 models. Obsidian models use a new modular architecture developed by IBM Research that provides high levels of efficiency and performance across a variety of tasks. Slate refers to a family of RoBERTa-based, encoder-only models that, while not generative, are fast and effective for many enterprise NLP tasks. All watsonx.ai models are trained on IBM’s curated, enterprise-focused data lake, on our custom-designed, cloud-native AI supercomputer, Vela.
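The three familiar architecture styles named above map directly onto well-known open-source model classes. The sketch below loads one public stand-in for each style via the Hugging Face transformers library; gpt2, t5-small, and roberta-base are illustrative substitutes, since the Granite, Sandstone, and Slate weights themselves are delivered through watsonx.ai. Obsidian’s modular architecture has no public equivalent, so it is omitted.

```python
# One public stand-in per architecture style described above.
from transformers import (
    AutoModelForCausalLM,                # decoder-only, GPT-like (Granite-style)
    AutoModelForSeq2SeqLM,               # encoder-decoder, T5-like (Sandstone-style)
    AutoModelForSequenceClassification,  # encoder-only, RoBERTa-based (Slate-style)
)

decoder_only = AutoModelForCausalLM.from_pretrained("gpt2")
encoder_decoder = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
encoder_only = AutoModelForSequenceClassification.from_pretrained("roberta-base")
```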
Efficiency and sustainability are core design principles for watsonx.ai. At IBM Research, we have developed new technologies for efficient model training, including our LiGO algorithm, which reuses small models and “grows” them into bigger ones. This method can save 40% to 70% of the time, cost, and carbon output needed to train a model. To improve inference speeds, we are drawing on our deep expertise in quantization, or shrinking models from 32-bit floating-point arithmetic to much smaller integer bit formats. Reducing the precision of an AI model yields significant efficiency gains without sacrificing accuracy. We hope to soon run these compressed models on our AI-optimized chip, the IBM AIU.
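To show the core idea behind quantization, here is a minimal sketch that maps 32-bit floating-point weights to 8-bit integers plus a scale factor. Production schemes, including whatever formats the IBM AIU targets, are considerably more sophisticated; this shows only the basic arithmetic.

```python
# Symmetric int8 quantization: store 8-bit integers plus one float scale
# instead of 32-bit floats, a 4x reduction in weight storage.
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    scale = np.abs(weights).max() / 127.0            # map max |w| to 127
    q = np.clip(np.round(weights / scale), -127, 127)
    return q.astype(np.int8), scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale              # approximate originals

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
print("max reconstruction error:", np.abs(w - dequantize(q, scale)).max())
```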
Hybrid cloud tools for foundation models
The final piece of the foundation model puzzle is an easy-to-use software platform for configuring and deploying models. IBM’s hybrid, cloud-native inference stack, built on Red Hat OpenShift, is optimized for training and serving foundation models. Enterprises can leverage OpenShift’s flexibility to run models wherever they need to, including on-premises.
In watsonx.ai, we have created foundation model-driven solutions that give customers both a friendly user interface and developer-friendly libraries. Our Prompt Lab lets users quickly perform AI tasks with just a few labeled examples. The Tuning Studio enables fast and robust model tuning on your own data, based on state-of-the-art efficient fine-tuning techniques developed by IBM Research.
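For a sense of what “a few labeled examples” looks like in practice, here is a generic few-shot prompt; the template is standard few-shot prompting, not Prompt Lab’s actual format.

```python
# A generic few-shot prompt: two labeled examples condition the model to
# label the third, unlabeled ticket. Any generative model can complete it.
few_shot_prompt = """Classify each support ticket as 'billing' or 'technical'.

Ticket: I was charged twice this month.
Label: billing

Ticket: The app crashes whenever I upload a file.
Label: technical

Ticket: My invoice shows the wrong tax rate.
Label:"""

# Sending `few_shot_prompt` to a generative model should yield "billing".
```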
In addition to IBM’s own models, watsonx.ai provides seamless access to a broad range of open-source models that enterprises can test and iterate on quickly. Through a new partnership with Hugging Face, IBM will offer thousands of open-source Hugging Face foundation models, datasets, and libraries within watsonx.ai. Hugging Face, in turn, will offer all of IBM’s proprietary and open-access models and tools on its platform.
To try a new model, simply select it from the drop-down menu. You can learn more about the studio here.
Looking ahead
Foundation models are changing the AI landscape, and the pace of progress is only accelerating. We at IBM are excited to help shape the frontier of this rapidly evolving field and to translate its innovations into real enterprise value.
Learn more about watsonx.ai
Take advantage of simple experiences and developer-friendly tools to get the best results from foundation models in your day-to-day work.