The Changelog: Software Development, Open Source - Dataset wars, Bark, Kent Beck needs to recalibrate, StableLM & blind prompting is not prompt engineering (News)
Episode Date: April 24, 2023
The dataset wars are heating up, Bark is a transformer-based text-to-audio model that can generate highly realistic, multilingual speech as well as other audio, Kent Beck needs to recalibrate after using ChatGPT, the team behind Stable Diffusion release a new open source language model & Mitchell Hashimoto weighs in on prompt engineering.
Transcript
What up nerds, I'm Jared and this is ChangeLog News for the week of Monday, April 24th, 2023.
One of my favorite things about our new email format is we no longer proxy links through email.changelog.com.
That's awesome for two reasons.
First, privacy. We have no idea which links
you're clicking on. And two, user experience. You can hover on a link to see where you're headed
first. I do that all the time. If you appreciate direct links as much as I do, pop in your email
address at changelog.com slash news. And if you've already done that, please tell a friend about the show. Okay, let's get into the news. The dataset wars are heating up. The New York Times reports that
Reddit will begin charging for access to its API. They appear to be following Twitter's playbook
here only with much better tactics because they won't be charging small time researchers or indie
bot and app developers.
It's companies like Google and OpenAI who want the data to power their machine learning projects who will have to pony up. Stack Overflow is also getting in on that action. I expect this will
become increasingly common, as AI-focused product offerings require ever larger, more diverse, and higher-quality data sources.
At the same time, the companies best positioned to provide those data sources are seeing less and less advertising revenue.
It actually feels like a better business model for the Reddits, Stack Overflows, and Pinterests
of the world.
And as an end-user of these systems, for some reason I feel better
about trading in my data to be compressed alongside millions of others and synthesized
by an AI than I do having it used to profile me for personalized ads. Not that they won't do both,
but still. Are you with me on that sentiment, or am I out on a limb here? Let me know in the comments. The team at Suno AI is helping
change the game in text-to-speech realism by releasing Bark, a transformer-based text-to-audio
model that can generate highly realistic multilingual speech as well as other audio,
including music, background noise, and simple sound effects. It can also laugh, sigh, cry, and make other non-word sounds that people make.
Crazy, right?
Here's an example that includes sad and sighs meta tags.
My friend's bakery burned down last night.
Now his business is toast.
And here's one more with laughter.
I don't like PyTorch, Kubernetes, or Schnitzel.
And xylophones flummox me.
You can still hear some digital artifacts and blips here and there,
but we're getting closer to synthesized audio
that's indistinguishable from the real thing.
And that's cool slash scary.
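For the curious, using Bark from Python is only a few lines. Here's a sketch based on the suno-ai/bark README around the time of release; the prompt text and output filename are just illustrative, and the API may have shifted since.

```python
# Minimal Bark sketch, following the usage shown in the suno-ai/bark README.
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav

# Download and cache the Bark model weights (several GB on first run).
preload_models()

# Bracketed meta tags like [sighs] or [laughter] nudge the model to produce
# non-speech sounds alongside the spoken text.
text_prompt = "My friend's bakery burned down last night. [sighs] Now his business is toast."
audio_array = generate_audio(text_prompt)

# Write the generated waveform to disk at Bark's native sample rate.
write_wav("bark_output.wav", SAMPLE_RATE, audio_array)
```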
In a tweet that went viral last week, extreme programming creator Kent Beck proclaimed,
quote, I've been reluctant to try ChatGPT. Today, I got over that reluctance. Now I understand why
I was reluctant. The value of 90% of my skills just dropped to $0. The leverage for the remaining 10% went up 1,000x. I need to recalibrate. End quote.
He expands on that tweet in a full-on blog post where he tells the story of his aha moment.
Oh, and if you're hoping for a scientific explanation of that 90-10 split
and which remaining skills got the 1000x boost, don't get your hopes up.
Kent says he was just
extrapolating wildly from a couple of his experiences, which is what he does. Stability AI,
the team behind Stable Diffusion, released a new open source language model they're calling
Stable LM. It's currently available in 3 billion and 7 billion parameter versions, with 15 billion to 65 billion parameter models coming soon.
This model is usable and adaptable for both commercial and research purposes.
They're also releasing a set of research models that are instruction fine-tuned on a combination of open datasets: Alpaca, GPT4All, Dolly, ShareGPT, and HH.
Some of these we've covered on the pod, others I haven't even heard of.
I love how much the open-ish AI advancements build and feed off one another, because the rising tide lifts all boats.
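If you want to kick the tires, the StableLM checkpoints load like any other causal language model via Hugging Face Transformers. Here's a minimal sketch; the model id I'm using is an assumption based on Stability AI's announcement, so double-check the hub listing before you run it.

```python
# Sketch: loading a StableLM checkpoint with Hugging Face Transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed hub id; verify the exact names, sizes, and license on the Hub.
model_id = "stabilityai/stablelm-base-alpha-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
# The 7B model is large; a GPU (and half precision) is strongly recommended.
model = model.to("cuda" if torch.cuda.is_available() else "cpu")

prompt = "Open source language models are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```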
It is now time for some sponsored news.
Thanks, Sentry!
Instead of spending time writing tests with little to no visibility into whether those tests actually
give you meaningful coverage in a given change, using Sentry's integration with CodeCov lets
you see the untested code causing errors directly in the Sentry issue stack trace,
which means no more time wasted trying to analyze your code base to find out where you need test coverage.
Here's what Alex Nathaniel, the director of technology at Vecter, has to say about it.
Quote,
With the Sentry and CodeCov integration, I no longer have to analyze our code base
and spend cycles thinking about where we need test coverage.
Instead, Sentry just tells me exactly where I need to focus,
saving me several weeks out of my year and reducing my time spent on building test coverage by nearly 50%.
Check the link in your show notes and chapter data to learn more about using Sentry with CodeCov and how to get all set
up. Thanks again to Sentry for sponsoring Changelog News. Mitchell Hashimoto weighs in on prompt
engineering in a long, detailed article titled Prompt Engineering vs. Blind Prompting. His
premise, quote, a lot of people who claim to be doing prompt engineering today are actually just blind prompting.
Blind prompting is a term I am using to describe the method of creating prompts with a crude trial and error approach paired with minimal or no testing and a very surface level knowledge of prompting.
Blind prompting is not prompt engineering.
End quote.
I feel so seen.
Mitchell goes on to make the argument that prompt engineering is a real skill that can be developed based on real experimental methodologies.
He demonstrates this with a realistic example, walking through the process of prompt engineering a solution that provides practical value to an application. If the way you've been using these new AI tools is aptly described by Mitchell as blind prompting,
definitely give this one a read.
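To make Mitchell's point concrete, here's a rough sketch of measured prompting versus blind trial and error: candidate prompts scored against a small labeled set. The task, prompts, and ask_model helper are hypothetical placeholders, not code from his article.

```python
# Score candidate prompts against a small labeled "demonstration set"
# instead of eyeballing one-off replies.

def ask_model(prompt: str) -> str:
    # Placeholder so the sketch runs without credentials; swap in a real
    # completion/chat call from your LLM provider here.
    return "positive"

# Tiny labeled set: inputs paired with the answers we expect back.
demonstration_set = [
    ("The checkout flow is so much faster now, love it", "positive"),
    ("App crashes every time I open my cart", "negative"),
]

# Candidate prompt templates we want to compare.
candidate_prompts = [
    "Classify the sentiment of this app review as positive or negative: {text}",
    "You label app reviews. Reply with exactly one word, positive or negative.\nReview: {text}",
]

# Exact-match accuracy per template over the demonstration set.
for template in candidate_prompts:
    correct = sum(
        ask_model(template.format(text=text)).strip().lower() == expected
        for text, expected in demonstration_set
    )
    print(f"{correct}/{len(demonstration_set)} correct for: {template!r}")
```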
That's the news for now.
I'll finish out this episode with some shout-outs
to our newest Changelog++ supporters.
Thank you to Jordan, Brian, Willoja, Liam, Jack, Richard,
Aaron, Emmanuel, David, John, Richard, Matthew, Joe, Max, Carl, and Anthony.
If you've never heard of Changelog++, check it out.
It's our membership program where you can directly support our work, make the ads disappear, and get in on cool bonuses like extended episodes and shoutouts from me on this very program.
On this week's changelog interview
episode, Adam sits down with Andrew Klein from Backblaze. They're chatting about hard drive
reliability and how they manage more than 250,000 hard drives. Have a great week. Share Changelog
with your friends if you dig it, and we'll talk to you again real soon.