Training data is “an extremely large dataset that is used to teach a machine learning (ML) model” (Techopedia). In this case, ChatGPT is a type of machine learning model called a Large Language Model (LLM). According to Molly Ruby of Towards Data Science, “LLMs digest huge quantities of text data and infer relationships between words within the text.” OpenAI called this process “generative pre-training” (GP).

TEACHING TIP: ChatGPT learns from human-created content. As we know, humans have biases, make mistakes, and draw wrong conclusions. Therefore, you can expect ChatGPT to exhibit bias, make mistakes, and draw wrong conclusions as well.

As you study AI, understanding the data used to train the model is vital to understanding how to use it. Training data has three forms:

Training Data Form 1. Unsupervised ML Models.

In unsupervised models, the training data is not labeled. 

TEACHING TIP: In their June 2018 paper, Improving Language Understanding by Generative Pre-Training, OpenAI scientists described a method that allows a model to learn from massive quantities of unlabeled data, followed by “discriminative fine-tuning” on each task, because it provides “a significant performance boost.” Therefore, much of ChatGPT’s dataset is not human reviewed. However, as the paper makes clear, without this approach ChatGPT might not exist.
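To make the idea of unsupervised learning concrete, here is a minimal sketch in Python. No labels are given; the code simply infers relationships between words by counting which words appear next to each other in unlabeled text, a toy stand-in for what LLMs do at vastly larger scale. The tiny corpus is invented for illustration.

```python
from collections import Counter

# Unlabeled training text: no human has marked anything as right or wrong.
# (These sentences are invented for illustration.)
corpus = [
    "students learn with teachers",
    "teachers learn with students",
    "students read books",
]

# Count how often each pair of adjacent words occurs across the corpus.
pair_counts = Counter()
for sentence in corpus:
    words = sentence.split()
    for left, right in zip(words, words[1:]):
        pair_counts[(left, right)] += 1

# The most common adjacent pair hints at a relationship the "model"
# learned without anyone labeling the data.
most_common_pair, count = pair_counts.most_common(1)[0]
```

Here the pair ("learn", "with") surfaces as the strongest relationship purely from word co-occurrence, with no human labeling involved.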

Training Data Form 2. Supervised ML Models.

In this case, the training data is labeled in some way, for example as “true” or “false” or with some other annotation. Labeling can slow down the training process, but if data is labeled accurately, it can greatly improve the output. For example, when ChatGPT moved from its 3.0 model to the 3.5 model we are using today, OpenAI hired 40 contractors to create a supervised dataset: prompts were collected from user input, and the labelers wrote appropriate responses. The resulting GPT 3.5 model is called the SFT model (for Supervised Fine-Tuning). That said, it is still a huge dataset, and much of it has not been reviewed by humans.

TEACHING TIP: We need to know that our prompts can, and likely will, be read by those hired as part of OpenAI’s SFT work. So, we should not put anything in a prompt that we would not want a stranger to read. This includes asking ChatGPT to revise a strategic plan, to revise confidential information of any kind, or anything that includes names, addresses, places, or other details that should be protected for privacy or legal reasons. This applies to any AI tool using SFT. However, it seems you need SFT to create a good model; this is the Human Intelligence (HI) that makes AI so powerful.
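A supervised dataset like the one described above can be sketched as follows: each training example pairs an input (a prompt) with a human-provided label (here, a labeler-written response). The entries and the `lookup_label` helper are invented for illustration; a real SFT dataset is far larger, and real training adjusts model weights rather than looking answers up.

```python
# Supervised data: every input is paired with a human-written label.
# (All entries are invented for illustration.)
sft_dataset = [
    {"prompt": "What is training data?",
     "label": "Training data is the dataset used to teach an ML model."},
    {"prompt": "Is the sky green?",
     "label": "No, the sky typically appears blue."},
]

def lookup_label(prompt, dataset):
    """Toy stand-in for a trained model: return the human-written
    response paired with a prompt, or None if it was never labeled."""
    for example in dataset:
        if example["prompt"] == prompt:
            return example["label"]
    return None
```

A supervised trainer would nudge the model so that, given each prompt, its output moves closer to the labeler-written response; this lookup simply makes the input-label pairing visible.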

Training Data Form 3. Reinforcement.

In reinforcement learning, the model’s performance is evaluated and it learns based on feedback. 

TEACHING TIP: As you interact with ChatGPT, you can give it feedback on its responses. As you do, you may even see it admit mistakes, converse with you, or seemingly change its “opinion.” However, the reviewers labeling data behind the scenes may also review your prompt and the response. For this reason, when using AI chatbots, scrutinize each site’s privacy policy to understand how its reinforcement learning happens.
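The feedback loop above can be sketched as a toy reinforcement signal: the model keeps a score for each candidate response and nudges the score up or down based on a reward (think of a thumbs-up or thumbs-down from a reviewer). The responses, rewards, and learning rate are all invented for illustration; real systems like ChatGPT use far more sophisticated reward modeling.

```python
# Preference scores for two candidate responses, both starting neutral.
# (Responses and feedback values are invented for illustration.)
scores = {"helpful answer": 0.0, "unhelpful answer": 0.0}

# Feedback stream: +1 for a thumbs-up, -1 for a thumbs-down.
feedback = [
    ("helpful answer", +1),
    ("unhelpful answer", -1),
    ("helpful answer", +1),
]

# Nudge each response's score in the direction of its reward.
learning_rate = 0.5
for response, reward in feedback:
    scores[response] += learning_rate * reward

# After learning from feedback, the model prefers the higher-scoring response.
preferred = max(scores, key=scores.get)
```

After these three pieces of feedback, the scores separate and the "helpful answer" becomes the preferred response, which is the essence of learning from evaluation rather than from labels fixed in advance.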

RESOURCE: Read more at 365 Data Science, Towards Data Science, and OpenAI’s Research Brief.
