Month: November 2023

  • Week 45.23: AI on the brain

    This week in artificial intelligence was a big one: Humane unveiled their highly-anticipated wearable, while OpenAI made strides with ChatGPT enhancements.

    The Humane Ai Pin

    A lot has already been said about the letdown that the Humane reveal was, mostly by people confused by the presentation style of the two ex-Apple employees who founded the company.

    If you’ve seen Apple events and Humane’s 10-minute launch video, you’ll note the contrast in delivery and positioning. Apple tries to couch features and designs in real-life use cases, and show authentic enthusiasm for what they do to improve customers’ lives (Steve was unmatched at this). Humane kicked off with all the warmth of a freezer aisle, missing the chance to sell us on why their AI Pin wasn’t just another tech trinket in an already cluttered drawer. They puzzlingly started with how there are three colors available and it’ll come with extra batteries you can swap out, before even saying what the thing does! The rules of storytelling are quite well established, and why they chose to ignore them is a mystery.

    A lot was also said about how two key facts in the video presentation, provided by the AI assistant so central to their product, turned out to be inaccurate. One was about the upcoming solar eclipse in 2024 (and Humane’s logo is an eclipse! How do you get this wrong?), and the other was an estimate of how much protein a handful of nuts has. It’s a stunning lack of attention to detail that this was not fact-checked in a prerecorded video.

    Personally, I have been waiting for the past five years to see what this stealth startup was going to launch, and as the rumors and leaks came out, I was extremely excited to see an alternative vision for how we interact with computers and personal technology. What they showed did not actually stray from what we knew. An intelligent computer that sees what you see, is controlled by natural language, and is able to synthesize the world’s knowledge and project it onto your hand in response to queries is amazing!

    The hardware looks good, channeling the iPhone 5’s design language to my eyes, and I’ll bet they had to pioneer new ideas in miniaturization and engineering to get it down to that size. I expected it to cost as much as an iPhone, but it’s only $699 USD, which feels astoundingly low. That’s not much more than what we used to pay for a large-storage iPod.

    The disappointment is in their strategy. By positioning it as a replacement for your phone rather than an accessory, they’ve reduced the total addressable market to a few curious early adopters and people who want to address a tech or screen addiction, the kind who intentionally buy feature phones in 2023. I think their anti-screen stance is interesting, but it doesn’t win over the critical mass necessary to scale and challenge norms.

    The Ai Pin comes with its own phone line for messages and calls (for $24/mo), so it’s not going to be convenient to use this alongside your phone, and I would not give up my phone while this is still half-baked — I say this kindly, because even the iPhone launched half-baked in many ways. For many things that we have become accustomed to in life, there is no substitute for a high-definition Retina display capable of showing images, video, and detailed or private information when necessary.

    Do I believe that Apple can one day get Siri to the level of competence that OpenAI has? I have to hope, because the Apple Watch is probably a better place for an AI assistant to live than in a magnetically attached square on my T-shirt. In any case, Humane seem to have taken a leaf out of their old employer’s book, and will be releasing this first version only in the US, so whether or not I would buy one is a moot point.

    OpenAI and GPTs

    Speaking of OpenAI, it would seem that they’re still the team to beat when it comes to foundation models. The playing field is full of open-source alternatives now, including Lee Kai-Fu’s 01.ai and their Yi-series models, but as a do-it-all company offering dependable access to dependable AI, OpenAI seems unassailable.

    They announced enhancements to their models, increasing context windows and speeds while halving prices for developers, and launched a new consumer-friendly product: customized instances of ChatGPT that work like dedicated apps, which they call “GPTs”. In effect, these are an evolution of Custom Instructions, which were introduced earlier this year as a way to tell ChatGPT how to behave across all chats. But one set of instructions can’t fit every situation (sometimes you’re a researcher at work and sometimes you want to have some dumb fun), so I’m not sure they caught on.

    So now GPTs let you specify (pre-prompt?) different contexts and neatly turn them into separate tools for different purposes. Importantly, you can now also upload knowledge in the form of files and documents for the agents’ reference in generating replies. This makes them more powerful and app-like, and normal people like me with no coding ability can create them by telling a bot what they want (in natural language, of course), or writing prompts directly. I recommend the latter, because chatting with the “Create” front-end tends to oversimplify your instructions over time and you risk losing a lot of detail about how you want it to work and interact with users.

    So what does the launch of these GPTs mean? Well, for many of the developers who were riding the OpenAI wave and only used their APIs to build simplistic wrapper apps, it’s a sudden shift in the tide and they’re now forced to build things that aren’t reducible to mere prompts.

    What we’ll soon see is a GPT gold rush. Brace yourself for a stampede of AI prospectors, each hunting for their piece of OpenAI’s bonanza: the company will be curating and offering GPTs in a “Store” and sharing revenue with creators. That’s a different model from their APIs, where developers pay OpenAI for compute and charge users in turn. Here, users all pay OpenAI a flat fee for ChatGPT Plus and can use community-made GPTs all they want (within the rate limits).

    Hear everyone talking about a viral GPT that makes it so easy to do X? When you want to try it out, you’ll see a call-to-action to sign up for ChatGPT Plus. This signals to me that launching GPTs is a strategy to drive paid account conversion, which begins the lock-in that OpenAI needs in order to make ChatGPT the new OS for services, not unlike how WeChat is the base layer that runs China, regardless of whether you use iOS or Android. Eventually you won’t even need to know about or choose the GPTs you use; the master ChatGPT system will call them as necessary. We may not be headed for a screen-less future, but we’ll probably see an app-less one.

    My GPT projects

    Of course I’m playing with this and making some of my own! Did you think I wouldn’t, given the ability to create AI things without coding?

    I’ve got a list of ideas to work on, and so far I’ve acted on three of them, which are explained on this blog in separate posts.

    ✨ PixelGenius was my first, and contains the most complex prompt I’ve ever written. It started out as a tool to generate photo editing presets/filters that you can use on your own in any sufficiently advanced photo editing app with curves, H/S/L controls, and color grading options. You can just say “I want to achieve the look of Fujifilm Astia slide film” and it’ll tell you how to do that. But now it does more than just make presets. More details and examples in the blog post here.

    😴 SleepyTales was the second, and I’m still amazed at how good it is. It’s designed for Voice Conversations mode (currently only in the mobile app), so you can get a realistic human voice reading you original (and also interactive, if desired) bedtime stories. These are never-ending, long, and absolutely boring tales with no real point, in drama-free settings, told in a cozy and peaceful manner. It’s the storytelling equivalent of watching paint dry, yet oddly mesmerizing. More on this and the next one here.

    🥱 SleepyKills 🔪 was born from a hilarious misread — I told Cien about it and ‘mundane’ became ‘murder’. So if your bedtime stories of choice are usually true crime podcasts, then you’re in luck. This GPT agent will create an infinite number of dreary murder stories, but stripped of all suspense, mystery, and excitement. They’re about as exciting as real police work, not the flashy TV investigating sort. Again, I still can’t believe how cool it is to hear these being written and read in real time.

    People have said the Voice Conversations feature is a game-changer for ChatGPT, but I didn’t really get it at first when using it for general queries. IMO, the killer app for it is storytelling. I’ve been using the voice called Sky for both of the above bedtime story apps, and it works well.

    Films

    • I watched David Fincher’s new film The Killer in bed on my iPad, just like he would want me to. Even then, it was spectacular, a cinematic victory lap for both him and Michael Fassbender. It plays with genre conventions, expectations, and riffs off his own body of work. There are some great moments and a fantastic performance by Tilda Swinton. 4.5 stars.
    • Speaking of performances by English actors, I also watched Guy Ritchie’s Operation Fortune: Ruse de Guerre, which is both a terrible name and a terrible attempt at creating a new globetrotting spy/special ops team franchise. But he has a certain touch even when making shit, and the film is a hell of a lot of fun, bringing out the best in Jason Statham (who tried to hold up The Expendables 4 and failed), as well as a villainous turn from Hugh Grant that, I shit you not, is easily a Top 10 career highlight for him. Jason Statham in the right hands is a very different animal from when he’s doing B material; I don’t know how to explain it. I actually gave it 4 stars on Letterboxd and won’t take it back.

    Album of the week

    REM’s Up received a 25th Anniversary Edition, with some tracks seemingly remastered and a whole second “disc” of an unreleased live performance they recorded on the set of the TV show Party of Five?! Sadly it is not a track-for-track live performance of the album, which would have been great. There’s no Dolby Atmos here either, so I’m just taking this as an opportunity to revisit this album.

    I can still feel the gut punch from the day Bill Berry bowed out, post-aneurysm. I was afraid they might break up, and REM was absolutely my favorite band back then (maybe still), so when Up came out, I was hopeful for a new and long-lived chapter to begin. And yeah, it was a weird album, playing with new sounds and using drum machines — not unlike The Smashing Pumpkins’ Adore album after Jimmy Chamberlin left. But many songs were great, some even recognizably REM. The band kept going for a few more albums, each a new spin on an evolving sound. And in true style, they dropped the mic at just the right moment.

  • Two storytelling GPTs: SleepyTales and SleepyKills

    😴 SleepyTales: Spins long and boring stories to help you unwind and fall asleep. Designed for voice mode, turn it on and chill…

    This is a GPT designed to be used with ChatGPT’s “Voice Conversations” mode (currently only in the mobile app). Although you can use it to generate text alone, it really shines when paired with one of their realistic voices. I currently prefer the one called Sky. Like it says in the description above, this GPT agent has been prompted to provide tension-free, inconsequential, meandering stories about anything you like. It reads them out in a slow, gentle manner, for quite a while at a stretch.

    So just turn on voice mode, pop your phone on the nightstand, and listen to the most boring stories ever. Unfortunately, I’m unable to make it speak indefinitely without building an app, so it will occasionally stop and ask if it should keep going. You can say “yeah” or even “mmhmm”, and it will. Or you can give it some direction. Hint: just try to get it to make the story more exciting; I don’t think you’ll succeed!

    And I suppose if you’ve nodded off and can’t tell it to continue, that’s a good thing? Nevertheless, I find its stories very good just for unwinding while still awake.

    🥱 SleepyKills 🔪: A generative true crime podcast that couldn’t be more boring. Sleep tight!

    While I was showing the former app to Cien, she misread “mundane” as “murder” and thought it generated boring true crime stories, to which I thought “WHY NOT!?”

    And so SleepyKills was born, designed to emulate the language and style of a popular true crime podcast except… you might find it very hard to care? Firstly because the murder stories are completely generative and fictional, and secondly because they’re almost comically full of irrelevant details and lacking in any excitement or suspense. The AI podcaster often spends time on aspects of the case that no one else would want to know about.

    Check it out if that sounds like your kind of bedtime story.

  • Introducing ✨PixelGenius GPT: An AI photo editing expert

    Do you edit photos, use filters, or make your own presets? What if you had an AI tool to help create any look you asked for?

    That’s ✨PixelGenius, my first “GPT” (a custom agent built on ChatGPT). It’s a photo editing expert that creates filters, suggests improvements, and helps you elevate your craft.

    • Describe a vibe and it’ll provide the settings to make a preset/filter.
    • Emulate a classic film stock!
    • Upload photos and get editing suggestions.
    • Reverse-engineer edited photos by providing a Before and After.
    • Learn editing techniques just by chatting naturally.

    It’s designed to help beginners learn the art and color science of photo editing, while letting pros save time with great starting points. For every adjustment, it explains the intent so you learn how this stuff works.

    It gives you standard adjustment values that you can plug into your favorite photo editing app like Darkroom, VSCO, Photomator, or Adobe Lightroom and save them as your own custom presets.

    I prefer to learn by trying stuff out rather than watching videos or whatever, so when I first started using Lightroom, it was a messy process of trial and error that lasted years. ✨PixelGenius turns that into an interactive, guided experience. It’s like having a photo editing expert on demand, and you can even get into deep conversations about color theory and photographic history. All you need is a ChatGPT Plus account.

    This involved writing one of the most comprehensive prompts I’ve done so far, so I’d be curious to know your thoughts after you give it a go!

    ✨Polaroid 600 adjustments
    A dramatic look created with ✨PixelGenius
  • Week 44.23

    I’ve been on the edge of a flu, with intermittent fatigue and headaches and a warm scratchy feeling at the back of my throat that makes me remember being ill and nauseous, but it hasn’t gone full-blown. Maybe I’ve actually got the flu, but the vaccine I got a few weeks ago has inspired my immune system to resist and now my body is locked in a hundred-year war. I write this on Saturday with a full-day social test (wedding party) to attend tomorrow that will probably push me over if this doesn’t get better.

    While on the subject of health: I suppose you’re officially old when you buy yourself a blood pressure monitor. It was a conversation about strokes that got me on it, and it was a very quick impulse purchase that went from idea to research to purchase in under half an hour.

    I think this is the Omron model I got. I didn’t know they made them this small nowadays, not to mention that you can measure BP from a wrist! It connects to your phone via Bluetooth, and the Omron Connect app also syncs with the Apple Health app — which was the selling point for me. Omron’s app looks overly complicated and isn’t very pleasant to use, but it doesn’t matter since you can just overanalyze and freak out over your data more comfortably in Apple Health alongside your other health metrics.

    ===

    The only music news of the week that mattered was the release of the final Beatles song, Now and Then. This was the third and last John Lennon demo on the tape that gave us Free as a Bird and Real Love back in 1995. The audio quality on this one wasn’t good enough for it to be finished back then, but now it’s relatively trivial to separate vocals from instruments using tools built on machine learning; one music YouTuber reviewing the song literally demonstrates it himself using an online service. And so Paul and Ringo were finally able to complete the song using guitar bits George recorded in the ’95 sessions, making it probably the last song we’ll ever get with all four Beatles on it.

    It’s a lovely song and I’m glad we’re around to enjoy this historic moment of celebration and closure. I don’t mind posthumous vault releases as long as they’re done with love, care, and good intentions, and the accompanying short film goes to great lengths to assure everyone that John would have gotten a kick out of this. Real Love is one of my all-time favorites, just for the beautiful melody in its chorus and refrain, and the existence of these three songs together is like a treasure from a parallel universe where the Beatles never broke up (a scenario that the Apple TV+ show For All Mankind tantalizingly visualizes for a moment in one episode). It’s extra heartbreaking that all three songs read to me like products of John’s regret and wish for reconciliation.

    The incredible clarity they were able to get out of the tape recording, though, makes me want new versions of Free as a Bird and Real Love, remastered with modern technology. I don’t care who complains about opportunism or George Lucas-ism, it should just be done to close the chapter off neatly and in the best possible way for fans. Get it done, money men!

    ===

    Other bits:

    • Normally when you see too many sequels and the drawing out of stories, it comes with lowered quality, formulaic laziness, and/or the jumping of sharks, but Only Murders in the Building topped itself with the third season and now I can’t wait for a fourth. (Spoiler) I didn’t expect them to really go down the musical route with proper abandon, but they did and that bloody Pickwick Triplets patter song was stuck in my head for days. And they only got bloody Meryl Streep to be part of it, Christ.
    • Okay, but you know what IS a scummy money grab? The Backbone controller company pushing their old designed-for-Android USB-C models at the launch of the new iPhone last month, telling early adopters to step right up and get them (and then messing up the release so many of us received ones without the iPhone-supporting firmware), KNOWING FULL WELL they had a 2nd-generation model waiting in the wings that would support the new iPhones even better! Old inventory was cleared at full price, and the new model was then quietly dropped, with redesigned dimensions that mean the camera bump no longer presses up against the chassis and bends it, and that even allow it to be used with a case on. I was a big supporter of their work, but no longer. They’ve apparently been deleting critical posts from their subreddit, if you can believe such foolishness.
    • Three months ago I switched mobile telcos from Circles to M1, lured by a bigger data package for the same price. Shortly after that, M1 migrated many users to new plans (it was not a very smooth process either, fraught with confusion and poor communications), and sort of reneged on a basic tenet of my “contract” (technically it’s a contract-free plan): once free, 5G would now be a paid add-on after six months. Fudge that, I said, and now I’m back with Circles for (yet again) even more data and a lower monthly price to boot. The porting process was also flawless compared to my experience moving to M1.

    While looking for the above link to my own recent post, I chanced upon older entries talking about local telcos and got sucked into reading notes from my younger self. It’s one of the greatest joys of keeping a blog, and yet I rarely take the time to. I’ll post a few links now.

    • As the iPhone and Android wars heated up, I asked in 2015 what telcos could possibly be thinking by advertising Xiaomi devices alongside iPhones in their weekly newspaper advertising spreads. I said they were legitimizing cheaper Chinese devices that customers could easily buy through other retail channels for a couple hundred bucks, which would come back around to hurt telcos by dispelling the idea that one should sign a two-year contract (with high margins baked in) to get a good phone. I think I was right? Who gets a phone with a contract these days?
    • Back in 2006, I noted the opening of the Fifth Avenue Apple Store in New York, called it the most beautiful storefront I’d ever seen, and said I wanted to visit someday. It would be 10 years before I did. And then there was this post from 2017 when Singapore finally got our first Apple Store.
    • Reader, I was even there when the fresh and ultra-luxe ION Orchard mall opened its doors in July 2009 — a fact that seems mindboggling today; hasn’t that building been there forever? At the recent team barbecue at a colleague’s condo overlooking Orchard Road, we were discussing some of the visible buildings and I discovered that our youngest team members are so young they don’t remember how the site of ION Orchard used to be a grassy mound of park-like open space. They were rightfully incensed when told that it used to be a popular picnic spot for Singapore’s domestic helpers until ION’s construction drove them away.
    • That just reminded me of the famous murder case where a bag of body parts was found dumped near that park.
    • In a 2016 post, I said that the future of gaming looked like cloud saves, cross-platform compatibility, and game designs that allowed you to play for either hours on a console or minutes on mobile. Back then, my signal was universal binary games on Apple TV that also ran on your iPhone. In 2019, Apple Arcade launched and that model was a core requirement for developers: all games had to support Mac, Apple TV, and iOS. And this week, Resident Evil Village launched for iPhone 15 Pro, as well as Macs and iPads with M1 chips or newer, the first in a new wave of console-quality titles you can play both at home and on the go. I think it’s a direct threat to whatever the Switch’s successor will offer, but the picture won’t be complete for a few more years.
    • Reading my posts from the year of living well (on sabbatical) is so bittersweet. On one hand, I was a bum reading books, watching films, and drawing all day. On the other, it was not unfulfilling? The little bit at the end of this weekly update from Jan 2022 reminded me how great a game Disco Elysium was and that I should replay it someday soon.

    (This week’s featured image was created by DALL•E from the idea of “The Beatles Resurrections”)