
Demystifying AI Adoption in Business: A Guide to Leveraging ChatGPT Technologies With Azure & Google

by Miguel Rodriguez, June 30th, 2023

Too Long; Didn't Read

Large Language Models like ChatGPT have arrived, and many people are using them. Some companies are already banning the use of ChatGPT for work due to data ownership and security concerns. What if there were a way to adopt the technology while keeping ownership of your data? And what if there were a way to reduce the false-data hallucinations these models create?

ChatGPT is something that everyone is not only talking about but actually using. Some companies are already freaking out and banning the use of ChatGPT for work due to data ownership and security concerns.


But what if there is a way to adopt the technology while keeping ownership of your data? And what if there is also a way to reduce the false-data hallucinations these models create?


The Problem

Large Language Models like ChatGPT have arrived, and many people are using them. They are like a Swiss army knife for the brain: you can use them to generate ideas, verify them, and word them nicely. Yet using the tool for commercial purposes raises two main issues.


  1. The first is that the data you submit may be used to train the system further. This would result in your information becoming indirectly public.


  2. The second is the hallucination problem. ChatGPT is a bit like a rogue student, submitting answers that might not be factual.


What if I were to tell you that there is a solution to both issues? But before I explain, let's check the facts first (unlike ChatGPT :-)).


When you log in to ChatGPT, you get the following prompt:

ChatGPT Warning


It states clearly that you should not share sensitive info because the data can be used by AI trainers to improve the system.


This basically states that if they deem any piece of information you submit worthy and relevant enough to become part of ChatGPT's training data, they can use it.


An interesting case is Getty Images suing Stability AI for improper use of its material. Getty found that Stability AI's models were generating images containing the watermarks Getty uses to deter piracy.


In a similar way, if one of your engineers uploads a very clever algorithm to the AI to ask for guidance, the training team could feed it into the next training round, and the model could “reuse” it in the future when your competitor asks for a similar type of code.


To give OpenAI credit, they do let you keep the copyright of all text (or images or source code) that is generated by the tool based on your prompts.


Yet some companies (like Samsung) have enacted policies that ban the usage of LLM technologies in the workplace. While such a policy might make sense to the security and compliance departments of these companies, it has several problems. First, it can be easily circumvented. Second, it is not easy to control and enforce. And third, it ultimately forfeits the productivity gains these organizations could reap from using such technologies.


Imagine if a caveman tribe decided to ban fire because it could burn the cave. That tribe would have no chance against a tribe that actually adopted the new fire technology in a controlled environment and used it to thrive.

The Solution

To the Data Ownership Problem

Enter the Azure OpenAI Service, based on the GPT models behind ChatGPT, and Google's Generative AI models, based on PaLM. In the next battle for dominance of the cloud platforms, Microsoft and Google are now making LLMs available as web services.


Since these models run inside your private cloud subscription, the data stays in the company and is not used to train future models.
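To make this concrete, here is a minimal sketch of calling a model deployed in your own Azure OpenAI resource with the openai Python SDK. The endpoint, deployment name, and environment variable below are placeholders, not real values:

```python
import os

from openai import AzureOpenAI  # pip install openai

# Connect to your company's own Azure OpenAI resource. Prompts and
# completions stay inside your subscription and are not used to
# retrain the underlying model.
client = AzureOpenAI(
    azure_endpoint="https://my-company.openai.azure.com",  # placeholder
    api_key=os.environ["AZURE_OPENAI_API_KEY"],            # placeholder
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    # On Azure, "model" is the name you gave your deployment.
    model="my-gpt4-deployment",  # placeholder
    messages=[
        {"role": "system", "content": "You are an internal company assistant."},
        {"role": "user", "content": "Summarize our travel policy in three bullets."},
    ],
)
print(response.choices[0].message.content)
```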


Both companies provide multiple pre-trained models that are specialized for specific use cases:

  • Text models: These are the most familiar to users of ChatGPT. Microsoft provides GPT-3 and GPT-4 models that understand and answer queries in natural language. Google provides PaLM for text and PaLM for chat.


  • Software code models: These models can generate, analyze, and refactor source code in several languages. Google provides Codey while Microsoft’s model is called Codex. These models can also generate code from natural language.


  • Image models: Microsoft provides DALL-E, which generates original images from text prompts. (The featured image of this article was generated by DALL-E 2.)


  • Embedding models: An embedding is an information-dense representation of the semantic meaning of a piece of text. Google provides the Text Embedding model, and Microsoft the Embeddings set of models. (A short sketch of generating embeddings follows below.)

Azure OpenAI

Google Vertex AI
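As a rough sketch of the embedding models mentioned above, here is how you might embed a few internal documents, reusing the Azure client from the earlier example; the deployment name is again a placeholder. The resulting vectors can be compared (e.g., with cosine similarity) to find semantically related text:

```python
# Reusing the AzureOpenAI client from the earlier sketch.
docs = [
    "How do I reset my VPN password?",
    "Expense reports are due by the 5th of each month.",
]

result = client.embeddings.create(
    model="my-embeddings-deployment",  # placeholder deployment name
    input=docs,
)

for doc, item in zip(docs, result.data):
    vector = item.embedding  # a list of floats encoding the text's meaning
    print(f"{doc[:40]!r} -> {len(vector)}-dimensional vector")
```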


Enabling employees at a company to use these gated models would allow for a more permissive use of these technologies in the workplace.

To the Hallucinations Problem

Did you hear about the lawyer who asked ChatGPT to come up with evidence to support his case while suing an airline, only to have the case dismissed because ChatGPT had hallucinated the evidence he used?


Large Language Models are, after all, statistical machines stitching together words to answer a user's query. Some facts will have a higher probability of being correct, others will be more bogus, yet the model always tries to answer. As Google puts it:


Model Hallucinations, Grounding, and Factuality: The PaLM API may lack grounding and factuality in real-world knowledge, physical properties, or accurate understanding. This limitation can lead to model hallucinations, which refer to instances where it may generate outputs that are plausible-sounding but factually incorrect, irrelevant, inappropriate, or nonsensical.


You cannot tell ChatGPT to be more factual, but you can tune both the Microsoft and the Google models to be “more” or “less” creative by adjusting the temperature parameter that both services expose.


This means you can reduce the amount of “nonsense parroting” (AKA BS) in the model's responses.
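For illustration, here is the same question asked at two different temperatures, again against the hypothetical Azure deployment from the earlier sketches. A low temperature makes the model pick the most likely tokens; a high temperature lets it sample more freely:

```python
question = [{"role": "user", "content": "Name the capital of Australia."}]

# Temperature near 0: deterministic, fact-leaning answers.
factual = client.chat.completions.create(
    model="my-gpt4-deployment",  # placeholder
    messages=question,
    temperature=0.0,
)

# Temperature near 1: more varied, "creative" answers.
creative = client.chat.completions.create(
    model="my-gpt4-deployment",  # placeholder
    messages=question,
    temperature=1.0,
)

print(factual.choices[0].message.content)
print(creative.choices[0].message.content)
```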

Using the Tool, Reaping the Benefits, and Sharing the Love

There are plenty of ways that an LLM can improve the productivity of employees and organizations.


  • The service desk that wants to make a large number of documents available to the rest of the company, in multiple languages.


  • The software team that wants to use the tool to refactor and review their code before committing their changes.


  • The marketing team that wants to play with possible scenarios for an upcoming campaign.


  • The corporate communications department that wants catchy titles for the publications it produces.


And the cherry on top is that some of these models can also be connected to databases and other structured data sources, which makes preparing reports much easier.
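As a rough sketch of that idea, one simple pattern is to query a database first and hand the results to the model as context, so the report stays grounded in your own numbers. The database file, schema, and deployment name below are all hypothetical:

```python
import sqlite3

# Pull structured data from a (hypothetical) internal database.
conn = sqlite3.connect("sales.db")  # placeholder
rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region"
).fetchall()
table = "\n".join(f"{region}: {total}" for region, total in rows)

# Hand the figures to the model so the summary stays grounded in them.
report = client.chat.completions.create(
    model="my-gpt4-deployment",  # placeholder
    messages=[{
        "role": "user",
        "content": f"Write a short sales summary based on these figures:\n{table}",
    }],
    temperature=0.2,  # keep the summary close to the data
)
print(report.choices[0].message.content)
```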

Conclusion

LLMs will do for the office what smarter machines and robots did for the manufacturing industry when they started populating production lines: people will do more with less.


Automation has now reached the office. Organizations now have ways to harness the power of Large Language Models to make their office workers more productive.


The million-dollar question is whether companies will share those productivity gains with their teams.