The Good Tech Companies - Help, My Prompt is Not Working!
Episode Date: May 12, 2025. This story was originally published on HackerNoon at: https://hackernoon.com/help-my-prompt-is-not-working. Learn what to do when an AI prompt fails: explore step-by-step fixes from prompt tweaks to model changes and fine-tuning in this practical guide. Check more stories related to machine-learning at: https://hackernoon.com/c/machine-learning. You can also check exclusive content about #ai, #generative-ai, #prompt-engineering, #ai-prompt-debugging, #llms, #fine-tuning-llms, #ai-prompts, #good-company, and more. This story was written by: @andrewproton. Learn more about this writer by checking @andrewproton's about page, and for more stories, please visit hackernoon.com. When your AI prompt fails, follow a step-by-step escalation: start by refining instructions, adding examples, or explanation fields. If that fails, try a different model, break prompts into parts, or as a last resort, fine-tune the model. Use this approach to save time and debug efficiently.
Transcript
This audio is presented by Hacker Noon, where anyone can learn anything about any technology.
Help, my prompt is not working, by Andrew Prosikin.
Despite your best efforts, the LLM still isn't behaving as expected.
What should you try next? Do you edit the prompt? Change the model? Fine-tune?
Any of these can be valid options, and there is an order in which to try these fixes.
Principle V: Follow the prompt fix escalation ladder.
This is part of an ongoing Principles of AI Engineering series.
See posts 1, 2, 3, and 4. When a prompt doesn't work as expected, I try the following fixes in order of preference:
1. Expanding and rephrasing instructions.
2. Adding examples.
3. Adding an explanation field.
4. Using a different model.
5. Breaking up a single prompt into multiple prompts.
6. Fine-tuning the model.
7. Throwing the laptop out the window in frustration.
In some cases, the order of things to try will be different. Nonetheless, having a default path saves time and preserves mental capacity for debugging.
The list isn't designed as a rigid ladder, but as a guide rope intended to keep you moving forward.
Now let's skim over each approach. The first three fall into the bucket of prompt engineering and will be covered in more depth in the next chapter. Multi-prompt and fine-tuning approaches will each have dedicated chapters.
Lightweight Approaches
Adding Instructions
The first thing to try is re-explaining to the LLM what to do via prompt instructions. Try adding clearer directions, rephrasing, or moving instructions around. Don't hesitate to repeat or reformulate statements multiple times in different parts of the prompt; LLMs don't get annoyed by repetition. For particularly important directives, add them at the beginning or end of the prompt for maximum effect [1, 2], as in the sketch below.
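As a hedged illustration (the task and wording here are my own, not from the original article), a prompt skeleton that repeats its key directive at the start and the end might look like this in Python:

    # A key directive is stated at the top and repeated at the bottom,
    # the two positions where it tends to have the most effect.
    KEY_DIRECTIVE = "Respond with valid JSON only. Do not add commentary."

    def build_prompt(ticket_text: str) -> str:
        return (
            f"{KEY_DIRECTIVE}\n\n"
            "You are a support-ticket classifier. Assign the ticket below one of\n"
            "these categories: billing, technical, account, other.\n\n"
            f"Ticket:\n{ticket_text}\n\n"
            f"Remember: {KEY_DIRECTIVE}"
        )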
Adding Examples
LLMs respond very well to in-context learning from input-output examples. Examples are particularly important if you are utilizing smaller models, which are not as naturally intelligent and so require lots of guidance [3]. Typically you would use one to three examples, though in some cases you could add more. There is evidence that performance improves with a higher number of examples [4], but so does the maintenance and execution cost. An example of a prompt with two-shot inference (language detection) is sketched below.
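The original post showed its example as an image; here is a minimal reconstruction of what such a two-shot language-detection prompt might look like (the specific example pairs are illustrative assumptions):

    # Two input-output pairs (the "shots") teach the model the task in context.
    TEMPLATE = """Detect the language of the text and answer with the language name only.

    Text: "Je ne sais pas."
    Language: French

    Text: "Wo ist der Bahnhof?"
    Language: German

    Text: "{input_text}"
    Language:"""

    def build_language_prompt(input_text: str) -> str:
        return TEMPLATE.format(input_text=input_text)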
Adding an Explanation Field
LLMs, like humans, benefit from having to explain their thinking. Add an explanation field to your output JSON and the output will usually get better. This will also help you identify why the model is making certain decisions, so you can adjust instructions and examples. In cases where the prompt uses internal documentation, ask the LLM to output the sections of documentation it used to construct its answers; this reduces hallucinations [5]. A sketch of such an output contract follows.
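A minimal sketch of this technique, with hypothetical field names that are my own assumption rather than the author's schema:

    import json

    # The output contract asks for an "explanation" field alongside the
    # answer, plus the documentation sections the model actually used.
    OUTPUT_SPEC = """Return your answer as JSON with exactly these keys:
    {
      "answer": "<the final answer>",
      "explanation": "<why you chose this answer>",
      "sources": ["<titles of documentation sections you used>"]
    }"""

    def parse_response(raw: str) -> dict:
        parsed = json.loads(raw)
        # The explanation field doubles as a debugging aid: it shows why
        # the model made the decision it did.
        print("Model reasoning:", parsed["explanation"])
        return parsed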
You can also attempt to use a chain-of-thought reasoning prompt. The challenge here will be properly extracting the output: you may need a second prompt or additional code to process responses that include chain-of-thought reasoning, as in the sketch below.
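For instance, if the prompt asks the model to reason step by step and finish with a line like "FINAL ANSWER: ..." (a delimiter convention I am assuming here, not one from the original article), the extraction code could be as simple as:

    import re

    # Pull the final answer out of a chain-of-thought response; returns
    # None when the model ignored the delimiter convention.
    def extract_final_answer(response: str) -> str | None:
        match = re.search(r"FINAL ANSWER:\s*(.+)", response)
        return match.group(1).strip() if match else None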
Changing the Model
Different models excel at different types of tasks. OpenAI's o3 model excels at analyzing code, but good old GPT-4o tends to produce better writing despite being cheaper per token. Part of the job of an AI engineer is keeping up with the strengths and weaknesses of available models as they are released and updated. Frequently try out different models for the same task. This experimentation is far faster and safer when you have automated tests and metrics to measure each model's fitness for the task, as in the sketch below.
Heavyweight Approaches
Every approach until now has been relatively low cost to try.
Now we are getting into the heavyweight fixes.
Breaking Up the Prompt
If one prompt can't get the job done, why not try a system of two or more prompts? This can work effectively in some cases. The two common approaches are splitting the prompt by area of responsibility, and using a new prompt as a guardrail that reviews the output of the previous one. Both approaches are introduced in part three and will be discussed in a subsequent chapter in more detail. A sketch of the guardrail pattern follows.
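Here is a minimal sketch of the guardrail variant, again using a hypothetical call_model placeholder rather than any specific provider API:

    FALLBACK = "Let me connect you with a human agent."

    def call_model(prompt: str) -> str:
        """Placeholder for your provider's completion call."""
        raise NotImplementedError

    # Prompt 1 drafts the answer; prompt 2 acts as a guardrail that
    # reviews the draft before it ships.
    def answer_with_guardrail(question: str) -> str:
        draft = call_model(f"Answer the customer question:\n{question}")
        verdict = call_model(
            "Reply APPROVED if the draft below is accurate and polite, "
            f"otherwise reply REJECTED.\n\nDraft:\n{draft}"
        )
        return draft if verdict.strip().startswith("APPROVED") else FALLBACK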
Fine-Tuning
Fine-tuning is an even heavier approach than using multiple prompts. For most problems, I use it as a last resort. Why am I hesitant to recommend fine-tuning in most cases? Fine-tuning is fundamentally a machine learning approach applied to generative AI. As such, it requires collecting large amounts of data and maintaining a whole set of ML tools in addition to generative-AI ones, and this is massive overhead for small to mid-sized projects. It also diminishes core generative-AI benefits: explainability and ease of editing logic.
Consider fine-tuning when:
1. Other techniques have failed to achieve the objective.
2. The problem is highly complex and specialized, and default LLM knowledge is insufficient.
3. You have a high-volume use case and want to save money by using a lower-end model.
4. Low latency is needed, so multiple prompts cannot be executed in sequence.
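To make the data-collection overhead concrete, here is a hedged sketch of what a fine-tuning dataset often looks like; exact formats vary by provider, and these example pairs are invented for illustration:

    import json

    # Many providers accept a JSONL file of input-output pairs; hundreds
    # to thousands of such rows are typically needed for useful results.
    examples = [
        {"prompt": "Classify: 'Card declined at checkout'", "completion": "billing"},
        {"prompt": "Classify: 'App crashes on launch'", "completion": "technical"},
    ]

    with open("train.jsonl", "w") as f:
        for row in examples:
            f.write(json.dumps(row) + "\n")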
Conclusion
Hopefully, this article clarifies the order of steps you should take when prompts don't
work as intended.
First, you would typically try a prompt engineering approach.
If that doesn't work, attempt to switch the model and see if that helps.
The next step is utilizing multiple interacting prompts.
Finally, consider fine-tuning if all other methods have failed.
If you've enjoyed this post, subscribe for more.
Thank you for listening to this Hacker Noon story, read by Artificial Intelligence.
Visit HackerNoon.com to read, write, learn and publish.