Editor’s take: As a long-time electronic musician (and the former editor of Electronic Musician and Music Technology magazines), I’ve always been enamored with musical synthesizers. Leveraging a specialized set of circuits, these instruments are designed to generate an enormous array of intriguing sounds from relatively basic raw sonic material. In several ways, today’s rapidly growing crop of generative AI tools bears some interesting resemblances to them, in that these tools can synthesize very impressive content from combinations of simple word-like “tokens” (albeit billions of them!). Generative AI tools are, in a very real sense, content synthesizers.
The latest entry to the content synthesis fray comes from Google, which is bringing an impressive array of new capabilities to the market via updates to Google Cloud and its Google Workspace productivity suite (Workspace, previously known as G Suite, consists of Gmail, Google Calendar, Google Drive, Google Docs, and Google Meet).
After letting Microsoft take much of the attention over the last few weeks with its OpenAI ChatGPT partnership — to the point where articles questioning Google’s ambitions for generative AI even began to appear — it is clear that the company long perceived as being an AI leader has not been resting on its laurels. Today’s debut offers a comprehensive set of applications, services, and interesting new approaches that make it clear that Google has no intention of ceding the generative AI market to anyone.
The company unveiled several new capabilities for Google Cloud, a new Generative AI App Builder for professional developers, upcoming capabilities for all the productivity apps in Google Workspace, the Maker Suite for less experienced “citizen developers,” a new PaLM large language model (LLM), and the ability to integrate third party applications and LLMs into its collection of offerings.
Frankly, it’s an overwhelming amount of information to take in at a single sitting, but it proves, if nothing else, that a lot of people at Google have been working on these efforts for a long time.
Not all of the capabilities will be available immediately, though. Google laid out a vision that mixes capabilities it has now with a roadmap of where it’s headed, but in the incredibly dynamic market that is generative AI, the company clearly felt compelled to make a statement.
Some of the most interesting aspects of the Google vision for generative AI are around openness and the ability to collaborate with other companies. For example, Google talked about the idea of a foundation model “zoo” where different LLMs could be plugged into different applications. So, for example, while you could certainly use Google’s newly upgraded PaLM (Pathways Language Model) text or PaLM chat models in enterprise applications via API calls, you could also use other third-party or even open source LLMs in their place.
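The “zoo” concept is essentially a pluggable-backend pattern: application code targets a common text-generation interface, and any conforming model — a hosted PaLM model, a third-party LLM, or an open source one — can be slotted in behind it. A minimal sketch of that pattern in Python (the class names and stubbed responses here are illustrative assumptions, not Google’s actual API):

```python
from abc import ABC, abstractmethod


class TextModel(ABC):
    """Common interface that any LLM backend must implement."""

    @abstractmethod
    def generate(self, prompt: str) -> str:
        ...


class HostedPaLMBackend(TextModel):
    """Stand-in for a hosted model reached via an authenticated API call."""

    def generate(self, prompt: str) -> str:
        # A real implementation would make an HTTP request here.
        return f"[palm] response to: {prompt}"


class OpenSourceBackend(TextModel):
    """Stand-in for a locally hosted open source model."""

    def generate(self, prompt: str) -> str:
        return f"[oss] response to: {prompt}"


def summarize(model: TextModel, document: str) -> str:
    # Application code depends only on the interface, so backends
    # can be swapped without touching this function.
    return model.generate(f"Summarize: {document}")
```

Swapping models then becomes a one-line change at the call site, which is precisely what makes the “zoo” appealing — and, as noted below, potentially overwhelming for IT departments to govern.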
The degree of flexibility across different LLMs was impressive, though I also couldn’t help but think that corporate IT departments could quickly start getting overwhelmed by the range of choices that would be available. Given the inevitable demands for testing and compliance, there might be some value in limiting the number of options that organizations can use (at least initially).
Google made a big point of emphasizing that organizations could integrate their own data on top of Google’s LLMs to make them customized to the unique needs of an organization. For example, companies could ingest some of their own original content, images, styles, etc., into an existing LLM, and that custom model could then be used as the core generative AI engine for an organization’s content synthesis applications. These customizations could prove to be particularly appealing to many organizations.
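One common way to layer an organization’s own content on top of a general-purpose model is retrieval: store company documents, pull the ones most relevant to each query, and prepend them to the prompt so the model answers in the organization’s own context. The toy keyword-overlap version below sketches the general technique only — it is an assumption about how such grounding typically works, not a description of Google’s implementation:

```python
import re


def _words(text: str) -> set[str]:
    """Lowercase, punctuation-free word set for crude matching."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Rank stored company documents by keyword overlap with the query."""
    q = _words(query)
    scored = sorted(documents, key=lambda d: len(q & _words(d)), reverse=True)
    return scored[:k]


def grounded_prompt(query: str, documents: list[str]) -> str:
    """Build a prompt that injects the org's own content ahead of the question."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Production systems would use embeddings and a vector store rather than keyword overlap, but the structure — retrieve, then prompt — is the same.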
There were also a lot of announcements about partnerships that Google has with a variety of different vendors, from little-known AI startups like AI21Labs and Osmo to quickly rising developers, such as code generation toolmaker Replit and LLM developers Anthropic and Cohere. On the generative image side, Google highlighted its work with Midjourney, which not only allows initial creation of images via text descriptions, but text-based edits and refinements as well.
Google also made a point of emphasizing the customizability within existing models. The company showed how individuals could adjust different model parameter settings as part of their initial query to set the level of accuracy, creativity, and more that they could expect from the output. Unfortunately, in classic Google style, very engineering-specific terms were used for some of these parameters, making it unclear whether regular users will actually be able to make sense of them. However, the concept behind it is great, and thankfully, parameter wording can be edited.
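The kinds of parameters in question map to sampling controls common to most LLMs: temperature governs randomness (the “creativity” dial), while top-k and top-p limit how many candidate tokens the model samples from. A small helper illustrates roughly what such settings look like — the parameter names follow common LLM conventions and the valid ranges are illustrative assumptions, not Google’s documented limits:

```python
def generation_settings(temperature: float = 0.7,
                        top_k: int = 40,
                        top_p: float = 0.95,
                        max_tokens: int = 256) -> dict:
    """Validate and package typical LLM sampling parameters.

    temperature: higher values favor more varied, less predictable output.
    top_k: sample only from the k most likely next tokens.
    top_p: sample only from the smallest token set whose probability mass
    reaches p (nucleus sampling).
    """
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature should be between 0.0 and 2.0")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0.0, 1.0]")
    return {"temperature": temperature, "top_k": top_k,
            "top_p": top_p, "max_tokens": max_tokens}
```

Renaming these to friendlier labels (“precise” vs. “creative” presets, say) is exactly the kind of wording edit a product team could layer on top without changing the underlying controls.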
Some of the most interesting content demos that Google illustrated for Workspace involved the ability to edit existing content (say, from a more formal written tone to a more casual one) or extrapolate from a relatively limited input prompt. Admittedly, other generative AI tools have shown these kinds of capabilities already, but the UI and overall experience model that Google showed looked very intuitive.
Among the key AI features coming to Workspace, Google highlighted the ability to:
- draft, reply, summarize, and prioritize your Gmail
- brainstorm, proofread, write, and rewrite in Docs
- bring your creative vision to life with auto-generated images, audio, and video in Slides
- go from raw data to insights and analysis via auto completion, formula generation, and contextual categorization in Sheets
- generate new backgrounds and capture notes in Meet
- enable workflows for getting things done in Chat
In addition to software, Google touched upon the hardware side of the Google Cloud infrastructure that’s able to support all these efforts for both Vertex AI and Workspace. The company noted how many of these workloads are powered by various combinations of their own TPUs as well as Nvidia’s powerful GPUs. While much of the focus on generative AI applications has only been on the software, there’s little doubt that hardware innovations in the semiconductor and server space will continue to have a large impact on AI developments.
Returning to the synthesizer analogy, the advancements in LLMs that Google’s new offerings highlight in many ways reflect the diversity of different sound engines and architectures used to design them. Just as there are many types of synthesizers, with the primary differences coming from the raw source material used in the sound engine and the signal flow through which they proceed, so too do I expect to see more variety in foundational LLMs. There will likely be a diversity of source materials used for various models and different architectures through which they’ll be processed. Similarly, the degree of “programmability” will likely vary quite a bit as well, from a modest number of preset options to the complete (but potentially overwhelming) flexibility of modularity — just as is found in the world of synthesizers.
In terms of availability, many of Google’s new capabilities are initially limited to a set of trusted testers, and pricing (and even purchase options) for these services has yet to be announced.
For regular users, some of the text-based content generation tools in Docs and Gmail will likely be the first taste of Google-driven generative AI that many experience. And, as with Microsoft, future iterations and enhancements will undoubtedly come at a very rapid pace.
There is little doubt that we’ve entered an enormously exciting and competitive new era in enterprise computing and the overall tech world. Generative AI tools have sparked a mind-blowing range of potential new applications and productivity enhancements that we’re really just starting to get our minds around. As with many big tech trends, overhype is inevitable. However, it’s also clear Google has now firmly placed a stake in the ground of the rapidly evolving world of generative AI tools and services. What happens next isn’t clear, but it’s going to be incredibly exciting to watch.
Bob O’Donnell is the founder and chief analyst of TECHnalysis Research, LLC, a technology consulting firm that provides strategic consulting and market research services to the technology industry and professional financial community. You can follow him on Twitter @bobodtech.