ChatGPT — OpenAI’s New Dialogue Model!!

Mandar Karhade, MD, PhD
Published in Towards AI
8 min read · Dec 1, 2022


OpenAI has released ChatGPT, a model trained to interact in a conversational way.
Credits: https://unsplash.com/@etienneblg

On Monday, OpenAI released ‘text-davinci-003’, the latest model in its GPT-3.5 series of large language models (LLMs). The model was trained using reinforcement learning from human feedback (RLHF) and builds on InstructGPT. RLHF was a step in the right direction from text-davinci-002, which used supervised fine-tuning on human-written text. Read about it in my previous article.

On Wednesday, OpenAI also released ChatGPT, a model trained to interact in a conversational way. According to OpenAI’s description, ChatGPT is a sibling of InstructGPT, which is trained to follow instructions in a prompt and provide a detailed response.

This is the next step in the iterative development of LLMs at OpenAI. With each release, OpenAI moves closer to the rumored GPT-4 models. Every iteration, whether the text, Codex, InstructGPT, or ChatGPT models, has yielded lessons, and both the performance and the safety of these models keep improving. In particular, the human-moderated RLHF step has substantially reduced harmful and fabricated outputs.

To read more about GPT-4, check out my previous article.

Methods for training

As described earlier, the model was trained using RLHF, the same method used for InstructGPT. The difference lies in how the training data was collected and set up. An initial model was trained with supervised fine-tuning (like the davinci-002 models). This model then generated multiple responses per prompt, and the responses were shown to human trainers (the “human feedback” in RLHF), who ranked them from best to worst. The rankings were used as the reward signal for the reinforcement learning step.

Source: OpenAI

Limitations

ChatGPT’s self-identified limitations are as follows.

  1. Plausible-sounding but incorrect answers: a) there is no source of ground truth during the Reinforcement Learning (RL) phase of training; b) training the model to be more cautious can cause it to mistakenly decline questions it could answer correctly (false positives on troublesome prompts); c) supervised training can mislead the model, because the ideal answer depends on what the model knows rather than on what the human demonstrator knows.
  2. ChatGPT is sensitive to phrasing. Sometimes the model fails to answer a question, but a slight tweak to the phrasing lets it answer correctly.
  3. Trainers prefer longer answers that look more comprehensive, which biases the model toward verbose responses and overuse of certain phrases.
  4. The model does not appropriately ask for clarification when the initial prompt or question is ambiguous; instead, it usually guesses at the user’s intent.
  5. A safety layer that refuses inappropriate requests via the Moderation API has been implemented. However, we can still expect both false negatives and false positives.

Credits: OpenAI.com

Example of ChatGPT: Help me peel a banana!

User: What is the best way to peel a banana? does it have potassium?

ChatGPT: There are several ways to peel a banana, and the best method may vary depending on your personal preferences. Some people prefer to hold the banana near the stem and gently pull the skin away from the fruit, while others like to use a knife to cut off the ends of the banana and then slice down the length of the skin to open it.

As for the question of potassium, bananas are a good source of this essential mineral. Potassium is important for maintaining healthy blood pressure and heart function, as well as for supporting healthy muscle and nerve function. A medium-sized banana contains about 422 milligrams of potassium, which is about 12% of the daily recommended value. So, yes, bananas do contain potassium.

User: but does it have it in the peel?

ChatGPT: While bananas do contain potassium, most of the potassium is found in the flesh of the fruit rather than the peel. The peel of a banana is primarily made up of fibrous tissue and contains very little of the nutrients that are found in the edible part of the fruit. So, while it’s not impossible that there may be some potassium in the peel of a banana, it is not a significant source of this mineral.

Example of ChatGPT: What is bokeh and fake it!

User: How does the bokeh work? Can I fake it?

ChatGPT: Bokeh is a term used to describe the aesthetic quality of the out-of-focus areas of a photograph. It is typically characterized by soft, blurry areas that draw the viewer’s attention to the in-focus subject of the photo. The term comes from the Japanese word “boke”, which means “blur” or “haze”.

Bokeh is created by the interaction of the aperture and the lens in a camera. The aperture is an adjustable opening in the lens that controls the amount of light that enters the camera. A larger aperture (a smaller f-number) allows more light to pass through, which creates a shallower depth of field. This means that only a narrow range of distances will be in focus, while objects outside of that range will be blurred.

In general, lenses with a more rounded aperture will produce smoother, more pleasing bokeh. You can fake bokeh to some extent by using editing software to blur the out-of-focus areas of a photo, but it will not be the same as the natural bokeh created by the lens.

Example of ChatGPT: Think about thoughts!

User: How to improve the clarity of thoughts? what does it even mean?

ChatGPT: Improving the clarity of your thoughts can help you to better understand your own ideas and communicate them more effectively to others. There are several strategies you can use to improve the clarity of your thoughts, including the following:

  1. Take time to organize your thoughts before speaking or writing. This can help you to present your ideas in a clear and logical order, making it easier for others to understand.
  2. Use specific and concrete language, rather than vague or abstract terms. This can help to make your ideas more tangible and easier to understand.
  3. Avoid using complex or technical language, unless you are sure that your audience will understand it. Using simple and straightforward language can help to make your ideas more accessible.
  4. Use examples and illustrations to help clarify your ideas. This can make it easier for others to understand your points and see how they relate to real-world situations.
  5. Take the time to listen carefully to others and ask questions to clarify any points that are unclear. This can help to ensure that you and your audience are on the same page and working towards a common understanding.

Overall, improving the clarity of your thoughts can help you to communicate more effectively and to better understand your own ideas. It can also make it easier for you to learn and grow as an individual.

User: but what does it mean?

ChatGPT: To have clear thoughts means to have ideas and concepts that are well-defined and easy to understand. This can include having a clear understanding of your own ideas, as well as being able to effectively communicate those ideas to others. Having clear thoughts can also involve being able to organize your thoughts in a logical and coherent manner, and using specific and concrete language to express yourself. Overall, having clear thoughts means being able to think and communicate effectively, and to better understand the world around you.

Example of ChatGPT: Fix errors in the code!

User: this code is not working as I expect — how do I fix it?

resultWorkerErr := make(chan error)
defer close(resultWorkerErr)
go func() {
    defer cancel()
    resultWorkerErr <- b.resultWorker(ctx)
}()
err := b.worker(ctx)
cancel()
if err == nil {
    return <-resultWorkerErr
}
return multierror.Append(err, <-resultWorkerErr)

ChatGPT: It’s difficult to say what’s wrong with the code without more context. Can you provide more information about what the code is supposed to do and what isn’t working as expected? Also, is this the entire code or just a part of it?

User: just part of it — the error never surfaces. I think there is something wrong with the channel

ChatGPT: It’s difficult to say without more information about what the code is supposed to do and what happens when it’s executed. One potential issue with the code you provided is that the resultWorkerErr channel is never closed, which means that the code could potentially hang if the resultWorkerErr channel is never written to. This could happen if b.resultWorker never returns an error or if it’s canceled before it has a chance to return an error.

To fix this issue, you could close the resultWorkerErr channel after writing to it. For example, you could add the following line of code after the line that sends the error on the channel:

close(resultWorkerErr)

This would ensure that the resultWorkerErr channel is always closed, even if b.resultWorker never returns an error. Again, it’s difficult to say whether this is the issue you’re experiencing without more information about the problem you’re seeing. Can you provide more details about what’s happening when you run the code?


Physician Scientist | Healthcare AI/ML leader | Builder | Photographer | Perpetual Learner