Month: May 2023

  • Week 21.23 (poem edit)

    An AI turned this week’s notes into poetry.

    A Chronicle of Week Twenty-One

    In a week where work did reign,
    Much to tell there’s little gain,
    Round it though, we gently dance,
    For work’s secrets shan’t have chance.

    (more…)
  • Week 21.23

    It was one of those weeks where not an awful lot happened outside of work. I don’t talk about work here but let’s sort of circle it.

    Reflections on AI

    One thing I can say is that I started making a presentation deck about the use of generative AI and GPT in design, initially to share my findings so far but increasingly as an exercise in structuring my thoughts into anything at all useful.

    A couple of things, on reflection: an AI assistant or collaborator poses significant risks for a lazy human in certain tasks since it tempts us to quickly accept its output without evaluating potential improvements. Assuming AI can do a job to 90% of a human’s rigor and quality, figuring out what the other 10% is without having done the same work yourself is quite the challenge. So the efficiency gains may not be as significant as you think, not until we figure out some smarter processes.

    An example of what I mean: you can feed ChatGPT the notes from an interview conducted with a customer about their experiences and how a product fits into their lives. Supply enough interviews, and ChatGPT can do the work of summarizing them in aggregate, presenting key themes and areas worth looking into, like maybe everyone thinks the price is too high, but it’s because they don’t fully understand the value of what they’re buying.

    It can create a bunch of frameworks to illustrate these findings, like personas and service blueprints. And it can even suggest solutions, like better marketing materials to explain this value to customers. The AI’s output might look pretty good, similar to what a team of human designers would (more slowly) produce, and a company might be tempted to make business decisions based on it. In fact, a team of human designers who haven’t read the interview notes themselves or thought deeply about it might also look at the AI’s work and say it’s good to go.

    The level of confidence and finality inherent in these AI outputs is incredibly convincing. But if a human were to go through all the interviews, listening to the recordings perhaps, they might realize there was a missing element, a feeling subtly unsaid on the margins, that means some customers do see the extra quality, they just wish there was a cheaper version of the product that did less. Skimming through the finished research report crafted by the AI, you wouldn’t even begin to guess where in the sea of correct conclusions this exception could be hiding.
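
    (For concreteness, the aggregation step above is only a few lines of code if you use the API instead of the chat window. A minimal sketch, assuming the 2023-era openai Python package; the file names and prompt wording are my own invention, not a real project:)

    import openai

    openai.api_key = "sk-..."  # your API key here

    # Load raw notes from each interview (hypothetical files)
    notes = []
    for path in ["interview_01.txt", "interview_02.txt", "interview_03.txt"]:
        with open(path) as f:
            notes.append(f.read())

    prompt = (
        "Below are notes from customer interviews about our product. "
        "Summarize them in aggregate: key themes, points of agreement "
        "and disagreement, and areas worth investigating further.\n\n"
        + "\n\n---\n\n".join(notes)
    )

    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.choices[0].message.content)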

    But there’s no question that this stuff is ready today to do some tasks like image editing, seen in Photoshop’s impressive beta release of a “Generative Fill” feature this week. I took a stock photo and doubled its height, and it was able to get the context of the scene and fill in the missing ceiling almost perfectly. That would have taken an experienced image compositor at least a few minutes, and anyone else way too much time. Just a couple of clicks now.

    I also looked into what Adobe is building with its Sensei suite of AI marketing tools, and that dream of generating and sending personalized ads, as in a unique package of art and copy tailored to a single customer’s behavior, would seem to be already here. I’m not 100% sure how this works everywhere, but in the past, you’d still need copywriters and art people involved in the process after marketers had identified the “customer journeys” and content-goes-here templates. With the opportunities now being identified, advertising messages crafted, and email offers all sent with a single click by the same person, there’s hardly a crack in the door left for the traditional artists and copywriters to make their case. Yet, the quality is good enough to satisfy all but the most discerning of corporations.

    You may observe that two of the largest advertising platforms are already in this space.

    What do you think about the current advancements in AI and their implications? Share your thoughts in the comments below.

    (One more example: I asked ChatGPT to help suggest edits for this post, and it rewrote one of the above sentences to be better. I liked it, but on closer inspection, there was a glaring semantic error I had to fix myself. It also suggested the call to action above, to increase engagement. Talk to me!)

    ===

    Personal updates

    There seems to be yet another wave of Covid sweeping through the city, based on the fact that several people I know have come down with it, and every bus and train car I’ve been on this week had more people wearing masks, suggesting that they, too, know people who’ve come down with it.

    Kim is going away for a couple of weeks, and I’m hoping she doesn’t run into it out there either; one of her colleagues caught it while traveling in the region a few days ago. I’m planning to stay home as much as I can during this time, and finish as many video games as possible.

    ===

    Media activity

    • Not a ton of progress in The Legend of Zelda: Breath of the Wild, which I’ve been playing consistently now for the past few weeks — a streak unmatched since the game first came out six years ago (I abandoned it out of fatigue shortly after that initial burst). I’ve now got all four Divine Beasts pointing at the castle and just need to build up the nerve and arsenal to storm it and be done with this. I seem to be procrastinating instead, exploring areas in the massive world that I never checked out before.
    • The girl band I’m rooting for the most in pop music, XG, performed at the Head in the Clouds festival in New York, and I watched some fancams of their set. The audio quality is terrible in all of them, so I won’t recommend starting there, but they are undeniably polished and tight as a group. Here are two music videos. I think I discovered them back in February, and at the time I said they’re gonna be monstrously huge this year. I stand by this.
    • If you watch the documentary series their label has put on YouTube, you’ll understand why they’re performing at this level: they’ve been physically and psychologically abused for the past six years of training, starting from when some of them were just 12. It’s horrendous to watch, but also probably par for the industry. While it’s good that someone decided to plainly put this footage out there, I’m not seeing much of a backlash, so it’s probably too late and already normalized. Some of the stuff their boss/producer says and does is straight up toxic emotional manipulation (he apparently came up as an idol himself so it’s like Ted Lasso says in the latest episode, hurt people hurt people).
    • Ted Lasso is almost done with its third season, one episode to go. I’m still liking it much better than season two, although it is sooo uneven and odd in its choices. You know the adage, “show, don’t tell”? It’s like the concept of season three is going against that conventional wisdom; a challenge the writing team decided to issue themselves: Can we take lots of scenes that people want to see (scenes of closure, catharsis, and vindication!) and make them happen off-camera and between episodes? And after doing that, can we still make people care through the strength of our set pieces and touching monologues? That’s the only explanation I have for what’s been going on. And to the team’s credit, it works some of the time. It’s not conventional TV, and maybe that’s the point.
    • Platonic, the new sitcommy show on Apple TV+, is much more conventional. It’s about a male and female pair of friends who are really just friends (so far), and a comparison to When Harry Met Sally is drawn in the very first episode. They had a fight and haven’t spoken in years, and then reconnect on the cusp of middle age, when it’s notoriously difficult to form new friendships, let alone platonic ones. I think the concept and setup are strong, but the execution is a little spotty. I’m not really into Seth Rogen’s work, and his character here feels exactly like what you’d expect from one of his characters, but by the end of episode 2 I think I’ll keep watching. The most jarring thing is Rose Byrne’s quasi-Australian accent, which raises too many questions about how they met and got along in the past.
    • Speaking of actors whose strong accents shatter the suspension of disbelief, Arnold Fucking Schwarzenegger is back in a NETFLIX TV SERIES which sounds like a dreamy reboot of True Lies. The show is called FUBAR and it’s about a father and daughter who both secretly work for the CIA without knowing about each other’s involvement. I haven’t seen any of it yet, but I’m dying to.
    • It strikes me that in the future, one could give a crazy brief like the above to a generative AI system and start watching something like FUBAR within minutes.
    • My first music discovery of the week is Eternally Yours, a new Alphaville album that sees the band doing symphonic rearrangements of songs such as Big In Japan and Forever Young with a full orchestra. Yes, in Dolby Atmos spatial audio. This is a band that was formed 41 years ago and the lead singer’s voice is still incredible, iconic.
    • The second is Tears can be so soft, the new song by Christine and the Queens. It’s simple but surprisingly soulful, and sonically recalls Massive Attack’s best work.
  • Week 20.23

    • A correction: Last week I said that the 10th Anniversary release of Daft Punk’s Random Access Memories would have been better if mixed in Dolby Atmos spatial audio. It actually IS! But the entire album isn’t labeled as such because the “second side” of outtakes and behind-the-scenes bits aren’t. With that knowledge, I’ve been enjoying it on headphones and need to put aside more time to hear it again. We regret the error!
    • There’s been a slight Korean bent to the week, in that I’ve had KBBQ and soju twice in four days, watched a lot of Jinny’s Kitchen on Amazon Prime Video, and am actually listening to BTS intentionally for the first time as I type this.
    • Jinny’s Kitchen is a reality TV show about a pop-up restaurant in Mexico run by five of Korea’s most recognizable stars, who go largely unnoticed by their customers. Each of its 10 episodes is 1.5 hours long, which is practically feature film length! The episodes could be much shorter, but the show lingers on a lot of mundane, repeated scenes like the staff rolling gimbaps, cooking noodles, and plating fried chicken. Perhaps that’s the point, giving fans a chance to watch them doing normal people things? Or maybe it’s normal for Korean reality TV? Is it designed to be left running in the background? Or was it an Amazon Prime Video decision to hit streaming time KPIs?
    • In any case, Korean soft power is wave after wave of these campaigns. The show should be boring but it kinda works, plus it has an unheard-of 9.2/10 score on IMDB. The best moments for me are when someone does recognize one of them, like when an incredulous customer sees V from BTS (59 million followers on Instagram) pop his head out from the back, and likens it to Drake running a restaurant in a country where no one knows who he is. Watching celebs do normal people things is straight out of the idol playbook and it works — I’m listening to their music with an open mind now, and we’re talking again about visiting the country someday.
    • The Anbernic RG35XX arrived and it’s a nice little device that really recalls the Game Boy series in spirit. It feels good in the hand and the screen is bright and beautiful — the community seems to recommend replacing its software with something called GarlicOS but it looks like wayyy too much trouble. The thing just works out of the box, and I’m not in the mood to start modding it. It still blows my mind you can have this for S$90 including shipping, and it does come with uhh lots of software loaded, so it’s great value if you’re looking for a gift.
    • I’m still going strong in Breath of the Wild on the Nintendo Switch, in the final stretch but honestly still in no shape to face some of the game’s stronger enemies. I’m now thinking that if/when I eventually finish this, I’ll probably have to take a break before starting the sequel, Tears of the Kingdom, given that it reuses the same world map and I won’t be in any mood to cover it all over again. Note to self: don’t pre-order any more games unless they come with discounts or exclusive bonuses!
    • We had a nail painting team event at work, for which the turnout was more male than female because of scheduling conflicts and maybe even a bit of self-selection, so I’m sporting gray/lilac nails at the moment. It’s probably been over 20 years since I’ve had colored nails, and I think I’ll keep this going now that I’m too old to be told off and it’s not such a big deal.

    I asked ChatGPT to analyze my post and it suggested I add a conclusion. I said “you do it, then”. Here’s what we agreed on:

    This week has been a whirlwind of diverse experiences and discoveries. From immersing myself in the captivating sounds of Random Access Memories in Dolby Atmos to exploring the world of Korean soft power through Jinny’s Kitchen, I’ve been on a journey of curiosity and appreciation.

    The Anbernic RG35XX has provided me with nostalgic gaming moments, while progressing in Breath of the Wild has been both thrilling and daunting. Even a simple nail painting team event at work has reminded me of the freedom to embrace self-expression.

    These experiences have sparked my curiosity and inspired me to share my thoughts as a writer. Life is full of opportunities to explore, connect, and find inspiration in unexpected places. So, whether it’s through music, cuisine, gaming, or the little joys, embrace the unknown and discover the stories and experiences that await.

    Thank you for joining me on this adventure, and until next time!

  • Week 19.23

    The new Legend of Zelda game, Tears of the Kingdom, launched this week, about six years after the last one, which I never finished. I pre-ordered the new game, of course, planning to join the rest of the world on launch day, exploring together and participating in conversations online, collectively figuring out unique solutions using the game’s open-ended physics engine. For those who haven’t seen it, the new game is sort of a sandboxy, Minecrafty affair where you can weld stuff together and build novel mechanical solutions to obstacles, almost certainly in a different manner than your friends. Think rudimentary cars from planks of wood, or hovercrafts, or the forest booby traps from Rambo: First Blood.

    But the guilt of never fully playing Breath of the Wild was getting to me, and I’ve been trying to get back into it over the last few weeks. Despite memories to the contrary, I’d made shockingly little progress in my 40+ hours of gameplay, spending most of my time bumbling about the countryside and climbing mountains, instead of conquering the Divine Beasts (1 out of 4) and collecting quality stuff. It seemed wrong to jump ahead to the sequel while I’m finally seeing what the last one had to offer.

    So in this past week I’ve made more progress than in the previous four years: conquered two more Divine Beasts, got the Master Sword at last, and uncovered most of the world map (two more areas to go).

    ===

    Craig Mod tweeted and tooted about having had enough of the iPhone’s (14 Pro, I assume) overprocessed look, and said he was making Halide his default camera app. Huh? But how does that help, I thought, unless he means to shoot in non-ProRAW RAW all the time (which is a thing Halide does: shoot in traditional RAW files which don’t involve the “Photonic Engine” processing pipeline). After some poking about, I realized something I should have ages ago: by turning off “Enable Smartest Processing” in Halide’s settings and choosing HEIC as the output format, you can actually take regular old (non-RAW) photos that look more natural and have more editing latitude! This effectively cancels out the iPhone 14 Pro’s image quality regressions.

    The overstimulated look of the default camera is one of the main reasons I hardly took any phone photos on my recent vacation to Japan, but if only I’d known… I could have! So with that, Halide won an instant annual subscription from me, and I now have a complication on my Lock Screen that launches straight into it.

    My toot about this was boosted by Mod, making it my most “engaging” Mastodon post ever, at 44 likes and 12 boosts.

    You have to see a comparison to understand how obnoxiously oversharpened and artificial the Photonic Engine is on the 14 Pro’s 48MP sensor for some reason. It increases local contrast and highlights texture “detail” that does not look like what you see in real life. It’s a Clarity slider turned all the way up, and I’m not a fan of Clarity sliders. The example below is a corner crop from two larger photos, not edited in any other way.

    Left: iPhone 14 Pro’s default processing | Right: Halide’s “dumb” processing is loads better

    ===

    My AI experiments this week have been around trying to teach ChatGPT to write better poetry. I posted a few days ago about getting it to do a critical reading of a poem, wherein I was completely stunned by its comprehension skills. But writing good poetry, not just gimmicky rhyming lines, that should be a different thing.

    I semi-accidentally discovered that asking a “persona” (that I’d established in another chat instance to test the idea of using an AI prompted with a human backstory as a customer research proxy) to write poetry yielded subjectively better results than when I asked a brand new ChatGPT instance. The mere suggestion that the bot’s point of view should draw references from vaguely described life experiences and preferences seemed to make a difference to what it spat out. Which is why many SF stories have AIs that don’t know they’re AIs, I guess, and they have memories of childhoods that never happened.
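
    (If you want to try this outside the ChatGPT window, the whole trick is a system message carrying the backstory. A minimal sketch, again assuming the 2023-era openai package; the persona details below are invented for illustration, not the actual ones I used:)

    import openai

    openai.api_key = "sk-..."

    # The backstory is the entire intervention; everything else is a
    # plain request for a poem.
    persona = (
        "You are Frank, a 52-year-old salaryman at a mid-sized firm. "
        "You have spent thirty years commuting, you value routine, and "
        "you are quietly anxious about what retirement will feel like."
    )

    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": persona},
            {"role": "user", "content": "Write a short poem about retirement."},
        ],
    )
    print(response.choices[0].message.content)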

    Here’s a poem that “Frank” (quite a salaryman-type personality) wrote about retirement:

    A lifetime spent in labor’s thrall,
    In structured days and office halls.
    Oh, to retire, to end the grind,
    To leave the corporate world behind.

    The dawn of leisure’s quiet grace,
    In this new, unhurried pace.
    Days unbound by ticking clocks,
    Unfettered as the wandering fox.

    Yet in this expanse so vast,
    Shadows of the past are cast.
    Habits of a lifetime past,
    In this new role, steadfast.

    A time of freedom, yet confined,
    By boundaries of the seasoned mind.
    Retirement’s paradox, I find,
    Is having time, yet being time-blind.

    Wow! No? It was coming up with absolute drivel before I handed the job over to Frank in another window. Side note: I started paying for ChatGPT Plus, just to have access to GPT-4; it’s a much better poet than the default GPT-3.5 model.

    So lately I’ve been working on giving feedback to a poetry-writing AI that I named Penrose and instructed to believe in itself as a 50-year-old professor of creative writing and philosophy, who’s bad with money, had two failed “great loves” because of their selfish acts, grew up in Boston, now works in London, and so on.

    Here’s one of the first things it wrote:

    In the chilled reflection of a London morn,
    I see a face, lines etched by yesterday’s turmoil.
    Cracks, not of frailty but life, loud in their silent form,
    That’s you, mate, and it’s all right.

    Witness the march of years, the dance of decades,
    In the ceaseless play of ambition and regret.
    Weaknesses? They’re the worn pennies in your pocket,
    The unpaid bills of your character, yet, not your debt.

    They’re the stumbles on cobblestones,
    The missteps in Milan’s echo-filled alleys.
    Not a damning verdict, but life’s hasty loans,
    A ledger, not of losses, but of rallies.

    In this world where even stone yields to rain,
    We’re all apprentices to time, chiseling away in vain.
    We’re all works, works under the master’s patient hand,
    Unfinished, flawed, yet, in that, something grand.

    ===

    • I haven’t been able to bring myself to watch more of Amazon Prime Video’s Citadel beyond the first two episodes. A colleague said he quit after 10 minutes. It’s reportedly one of the most expensive TV shows ever made, at an average of $50M per episode.
    • Silo on Apple TV+ has been very good so far. It’s a little bit YA sci-fi, a little Hunger Gamey, a little Snowpiercey (but better).
    • I saw a segment on NHK about how this iconic street corner featured in Slam Dunk has been inundated by tourists now that Japan is open again. They interviewed pilgrims from S. Korea, China, and Taiwan who said they just had to come down and see it — “if you love anime at all, this place is a must”. So I decided to get started on the ancient anime, which Netflix has 8 seasons of. The day after seeing episode 1, I ended up standing behind a guy on the train watching the show on his phone.
    • The 10th Anniversary Edition of Daft Punk’s Random Access Memories is out, and the album still holds up extremely well for me. If only they’d come back to remix it in Spatial Audio, that would have been incredible.
  • Miss you, humanity

    Perhaps you’ve heard of what’s been going on in generative AI and thought, “I have no need to generate any pictures or professional email filler”, and dismissed the subject. Perhaps, like me, you’ve read that GPT-4 can now “pass exams” from the SATs to legal bar exams, accounting ones, and so on, without really knowing what that means. It can choose the right answer out of a list, and write little structured essays on given topics? That’s amazing, but not unexpected.

    But I’m here to tell you that I asked ChatGPT (GPT-4) a question yesterday, quite an exam question, and its answer floored me. These things are called generative AI, but that name undersells the incredible interpretive power they possess when it comes to language and writing comprehension.

    I asked it to analyze this poem by Gabrielle Calvocoressi, one of the most devastating things I’ve read in the past year: Miss you. Would like to grab that chilled tofu we love. Go read it, I’ll wait. Get some tissues.

    I asked it to explain what the poem was about, and call out what it’s doing as a piece of writing to achieve its emotional impact. I’m not aware that any similar analysis of this poem exists online, if you’re wondering. My thoughts going into this: I know these AIs can write poetry, but I haven’t seen anything great. They must understand unstructured or unusually structured language to some degree, but can they “read” poetry from a human point of view? Probably not? Will they really understand what this is implying: the loss, the empty spaces, the long journey, the light body? And now I realize how dumb I was, because clearly they’ve “read” thousands of books of and about poetry, and literary criticism, and style, and should know a lot more than the average person.

    Here’s its answer, unedited. Good enough to pass any high-school literature class, if not any blind test of humanity.

    ===

    This is a deeply emotional poem that appears to be exploring themes of grief, loss, longing, memory, and the power of shared experiences. It is presented as a monologue, presumably from the perspective of the speaker who is missing someone who is now absent. It’s filled with a sense of yearning, a desire for the return of the past and the presence of someone who is deeply missed.

    The poem starts off with the speaker expressing their indifference to the physical state of the missing person, suggesting that they would be happy even if the person could only present in their ‘light body’. This term could have different interpretations but it seems to suggest a spiritual or ethereal form, possibly indicating that the person is deceased or in a state where they cannot physically be present.

    The middle of the poem is filled with reminiscences of shared experiences. The speaker misses the shared rituals and discussions about mundane things such as the type of chilis used in a dish of tofu. This highlights the value of seemingly trivial shared moments in relationships and their ability to evoke strong emotions when remembered. The speaker also mentions watching the person on screens and wishing they could get out of the hospital, which suggests a period of illness or suffering that the person had to endure.

    Towards the end, the speaker talks about being willing to order their shared favorite dish and eat it alone, even though it seems painful for them. This can be seen as an attempt to honor their shared memories and perhaps a step towards acceptance of the loss.

    The speaker also acknowledges not knowing the journey the absent person might be on, whether it’s long or not, and expresses a wish to know more about their current state – what they’re reading, if they’re reading, which suggests a longing for a continuation of their shared intellectual and emotional connection.

    Throughout the poem, the simple language and the direct, honest sentiment adds to its emotional power. The repetition of “Miss you” and “Wish you could” emphasizes the sense of longing and loss. The specific details of shared experiences and the mundanity of their shared conversations humanize the relationship, making the emotions more relatable and poignant.

    Overall, the poem is a moving exploration of love, loss, and the power of shared experiences. It speaks to the human condition of missing loved ones, the difficulty of letting go, and the beautiful yet painful memories that bind us to the past.

  • Week 18.23

    I was in a cab listening to music on my AirPods, and just as we were pulling up, I switched to Transparency Mode and heard a song playing over the car’s radio that sounded kinda familiar. I knew it was a remix of some tune I wanted to know, and managed to Shazam it before getting out.

    Looking into it later, I realized the melody was what I’d been trying to figure out about Charli XCX’s White Mercedes for over a year. Why does that one line she sings — literally the line “like a white Mercedes” — sound like some other song I can’t name? It turns out, it’s literally a song I would have absorbed from the world around me but never intentionally listened to: One Direction’s Night Changes from 2014. Ahhh it’s so good to have that itch scratched! And there are so many more like this I’ve yet to solve.

    Let me say it again for the search engines: Charli XCX’s White Mercedes sounds like, samples, or contains an interpolation from One Direction’s Night Changes.

    Another similar thing happened as I was playing WarioWare, Inc. (GBA, 2003) again for the first time in years. The background music in one stage awoke some long dormant memory and I needed to know what pop song from my younger days it sounded like. After a lot of humming aloud and trying to Shazam it and searching online… I concluded that the song it reminded me of was… itself. It’s called Drifting Away, and I must have really loved it back when I was playing the game for the first time.

    Speaking of retro games, I lasted a full week. The Anbernic RG35XX I said I wouldn’t buy since I already have a Retroid Pocket Flip is now on the way to me from China. There are some reports of shoddy QA and long-term durability, but for S$90 I think that’s to be expected.

    ===

    Another week, another bunch of water-cooler conversations about AI. Specifically how it relates to our work in design: as accelerator, collaborator, ambiguous combatant, amoral replacement. I don’t just mean the making pictures and writing words part, but analyzing messy human interactions (it’s just unstructured data) and presenting them in new ways.

    I ran one experiment with ChatGPT on Sunday afternoon, just for kicks, and it sort of blew my mind. From a handful of behavioral traits and demographic details I supplied, it was able to inhabit a fictional personality that I could speak to and pitch various products to. So far so par for the course. But then it reacted to a hypothetical KFC offering called “The Colonel’s Colossal Combo” in a way I didn’t expect, citing a conflict with values and dietary preferences that I did not specify. When asked where they came from, it argued that although they were not specified, they could be reasonably expected from the “Frank” persona I’d created, because of some other background that I DID provide. It sounded a lot like intelligent reasoning to me, and regardless of how it works, I was happy to accept the inference the same as if a colleague were making it.
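
    (Here’s roughly what that setup looks like if you script it rather than type it into ChatGPT. A minimal sketch with the 2023-era openai package; the backstory is invented for illustration and deliberately says nothing about food, since the interesting part was probing where the unprompted reaction came from:)

    import openai

    openai.api_key = "sk-..."

    messages = [
        {"role": "system", "content": (
            "You are Frank: 52, risk-averse, careful with money, set in "
            "your routines. Stay in character and react as Frank would."
        )},
        {"role": "user", "content": (
            "KFC is launching 'The Colonel's Colossal Combo': a double "
            "bucket with fries, gravy, and dessert. Would you order it?"
        )},
    ]
    first = openai.ChatCompletion.create(model="gpt-4", messages=messages)
    print(first.choices[0].message.content)

    # Probe the inference: ask the persona to trace its own reaction
    # back to the background it was given.
    messages += [
        {"role": "assistant", "content": first.choices[0].message.content},
        {"role": "user", "content": "Which parts of your background made you react that way?"},
    ]
    second = openai.ChatCompletion.create(model="gpt-4", messages=messages)
    print(second.choices[0].message.content)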

    Like with all advances in automation, it’s inevitable that we’ll now be able to (have to) do more in less time, with fewer people. Until things go “too far” and need to be reined in, it’s not even a question of whether we should — every industry is incentivized to discover what can be done before it gets done to them. I think there are some exciting opportunities for designers, and a slew of unknown consequences for society. And just like that, we’re back in a new “fuck around” phase of the tech cycle.

    ===

    A couple of weeks ago I made a bunch of fashion-style athleisure photos with Midjourney v5 but somehow forgot to post them. The photorealistic ones are quite incredible, and the few illustrations I got were really strong too.

    This week, v5.1 dropped, promising more opinionated outputs and sharper details, so I tried the same prompt again. Many of the results were as broken as these bodies.

    They probably fixed something quietly because it’s been more reliable in the days since. I thought it would be interesting to do a comparison of models 1 through 5.1 with the same prompt. It’s crazy how far it’s come in just over a year.

    photograph of Queen Elizabeth II in a dim video arcade, sitting at a street fighter 2 arcade cabinet, intense concentration playing game, side view, screen glow reflected on her face, atmospheric dramatic lighting --ar 3:2

    If you saw Midjourney a year ago, you were probably impressed by how it and Dall-E 2 could turn quite natural text descriptions into imagery, even if the results were still quite hallucinatory, like DeepDream’s outputs circa 2015. I don’t think you would have expected to see the pace of improvement be this quick.

    It’s not just rendering improvements from distorted pastiches to photorealistic scenes with internal logic (global light affecting nearby objects realistically, fabrics folding, leather seat covers stretching under buttocks), but also how it’s evolved through feedback and training to understand intent: the idea of a “side view” started working from v4. None of the earlier re-generations got me the camera angle I was going for. The tools that promise to do this for video are probably going to get good faster than you expect.