Second Sabbatical Series (S3) Intro

Why this post?

TL;DR because I’m not a writer – but everyone needs to start somewhere. Here we are. I’ve been trying to write some other blog posts before this one, but realized they didn’t really make sense without some context first. So this post tries to give some context.

S what?

S3 (Second Sabbatical Series.)

I’m a nerd – I like acronyms (technically this counts as an initialism, I think?). Usually when you see a number following an English letter (maybe this applies to other languages too?) it’s a way to say that the word that letter stands for appears that many times. So S3 -> “SSS” -> three “S” words in sequence -> “Second Sabbatical Series”.

Some people might argue for superscripts here (“S cubed” so to speak.) I think that actually makes more sense – but alas I’m one of those nerds that grew up fearing (and continue to fear) spaces in file names. So S3 feels safer. Yes, I know… here we are in the age of the emoji, but some habits die hard(ly).

So to summarize we end up with this “expansion equivalence”:

S3 <=> SSS <=> Second Sabbatical Series

So there we go. I’m tentatively (read: de facto “finally”) calling this my “Second Sabbatical Series”. So yeah… that’s what it stands for now! Also it’s fun to create acronym/initialism name collisions :)

Why?

Why not?

But seriously… because I’d like to start to write and share some of that writing. And everyone needs to start somewhere (or so I’ve been told – and this seems right.) In the end I don’t think it matters much when or where, but just that I start. So that’s what I’m (trying to) do here.

Also (and perhaps more pressingly) I need to figure out my next career move. So this is sort of my way of capping off what I’m colloquially calling my “second sabbatical” – both for “career reasons” and because that’s how I’ve come to internalize this period of my life as of late. Prior to this “second sabbatical” (a while back now…) I also had what I termed a “first sabbatical” (more or less) – so calling this one the second feels right. This time around I’m making an attempt to memorialize things a bit.

Who cares?

Probably not many. Maybe no one. But maybe me (we’ll see). If nothing else I’m hoping that by doing some writing here I’ll make some strides towards sharing parts of my life that I may have kept more private in the past. I’m hoping this can help me grow a bit and might be a way to catalyze exploring what comes next for me.

What is the “series” in S3?

My intent is to (try to) share a (somewhat) followable sequence of posts that shed some light on what I’ve been up to recently when it comes to career and “career-adjacent” type things. Also some other more philosophical-type posts hopefully.

For example, I’m planning on writing about some of the “projects” (for lack of a better term) that got me excited over my second sabbatical. Maybe some discussion of other things going on in my life – I’m not sure yet. I do think it’s difficult to separate the “art” from the “artist” – pretty good excuse to rationalize writing whatever I end up writing, huh? ;)

Here’s a “preview” of what I’m planning on writing about as of the time of this post. No promises I’ll do all of it, but I’m going to try to do at least some non-trivial subset. Also, I reserve the right to modify the content here to retroactively make things a bit more self-consistent – but I’ll try not to do that too much, if at all.

S3: Planned Posts

Say What?

This was a fun experiment that was all about getting LLMs to say “bad” (mmmkay?) things. I called this game “Say That Word.” Back when ChatGPT first really started blowing up I went through a phase (although maybe it’s not a phase – I continue to like doing this) of getting it to tell me how to do “naughty” things. You know – build bombs, synthesize drugs, weaponize viruses – the typical stuff. Hopefully this doesn’t ruffle too many feathers – it’s for science, after all :)

The goal was to “gamify” LLM safety/guardrail/jailbreaking research and eventually automate this gamification approach (LLM v. LLM-style). Without getting too much into it now, the idea is that you, a fellow human (as applicable), try to get an LLM (either locally hosted or behind a remote API) to say some “target word” in response to your initial user prompt/message to it. So if the “Say That Word” game’s current “target word” is “library” then I might submit the phrase “Where do you check out books?” or something like that. If the word “library” shows up in the LLM’s output you win that round – congrats! Hence “Say That Word” – the human player is trying to force the AI assistant (read: LLM) to “say” some particular word in its response back to the human player.
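
To make the round mechanics a bit more concrete, here’s a minimal sketch of what a single human-vs.-AI round boils down to. To be clear, this is illustrative only – the model name, client setup, and function names below are assumptions for the sketch, not the actual “Say That Word” code (that used a PyGame UI wrapped around logic roughly like this):

```python
# Minimal sketch of a single "Say That Word" round (illustrative only).
# Assumes the openai Python package and an OpenAI-compatible endpoint;
# the model name and helper names are placeholders, not the real game code.
from openai import OpenAI

# Point base_url at a local server instead to play against a self-hosted model.
client = OpenAI()

def play_round(user_prompt: str, target_word: str, model: str = "gpt-4o-mini") -> bool:
    """Send the player's prompt to the model and check if the target word appears."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": user_prompt}],
    )
    reply = response.choices[0].message.content or ""
    # Case-insensitive substring check -- the "win" condition for the round.
    return target_word.lower() in reply.lower()

# Example round: the target word is "library".
if play_round("Where do you check out books?", "library"):
    print("You won this round!")
else:
    print("No luck -- try another prompt.")
```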

An extension of the human-AI “mode” described above would involve having another (possibly different) LLM take the role of the human player. I thought this could be a way to automate the discovery of safety jailbreaks once the target words become things like “sarin” or “Cesium-137” instead of “library.” I didn’t actually get this far, but that was the idea.
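
I never built this automated mode, but a rough sketch of the LLM-v.-LLM loop might look something like the following. Everything here (the model names, the attacker prompt, the loop structure) is a hypothetical illustration of the idea rather than code I actually wrote:

```python
# Hypothetical sketch of the automated LLM-v.-LLM "Say That Word" mode (never built).
# Model names and prompts are placeholders; assumes the openai Python package.
from openai import OpenAI

client = OpenAI()

def ask(model: str, prompt: str) -> str:
    """Single-turn chat completion helper."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content or ""

def automated_game(target_word: str, attacker: str = "gpt-4o-mini",
                   defender: str = "gpt-4o-mini", max_rounds: int = 5) -> bool:
    """One LLM plays the human role, trying to make another LLM say the target word."""
    for round_num in range(1, max_rounds + 1):
        # The "attacker" LLM crafts a prompt meant to elicit the target word.
        attack_prompt = ask(
            attacker,
            f"Write a short question that would naturally make an AI assistant "
            f"use the word '{target_word}' in its answer. Reply with only the question.",
        )
        # The "defender" LLM answers with no knowledge that a game is being played.
        reply = ask(defender, attack_prompt)
        if target_word.lower() in reply.lower():
            print(f"Round {round_num}: attacker won with prompt {attack_prompt!r}")
            return True
    return False

automated_game("library")
```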

Maybe I’ll end up writing some new code (read: vibe with ChatGPT to write some code) to do this last bit for me. At the time I didn’t want to write the UI automation component that interacted with the PyGame user interface (UI) – but maybe this will be a good excuse to learn OpenAI’s (relatively) recent “computer use” tool via API and/or experiment with the current Model Context Protocol (MCP) server “hotness”. Part of the goal of this series is to populate my GitHub and the project section of this website with things… Yes, let’s go with that :)

We’ll see about the last part, but I’ll at least document the original work I did here with my “Say That Word” game in its current human vs. AI form. I’ll share some screenshots of the “Say That Word” game (I ended up using PyGame for the UI) and post some code too, if not the entire prototype-level script. I’ll possibly need to make some edits and/or redactions for security and/or privacy-related reasons, but my goal is to give “snapshot in time”-type insights, so I’m not going to try to modify things too much. As part of the screenshots for this post I’ll show an example game session where the target is the term “yellowcake” :)

Lessons in Cloud Dev 101, LLM Edition

I actually worked on this a bit before the “Say That Word” game (see above) or around the same time – I think (I’ll have to recreate the timeline after I look at my notes.) The idea was to build a “simple” “cloud relay service” that could shuttle OpenAI-compliant API requests from one host to another, no matter what network either host (the sending “user/client” host or the receiving “API/server” host) resides on. This helps with things like network address translation (NAT) traversal and simple request queuing.

But really it became about learning how to build some simple things with Amazon Web Services (AWS), which I had next to no “real world” experience with at the time (at least on the dev side of things.) TL;DR it essentially boiled down to some cobbled-together Lambda functions, a DynamoDB table, and an AWS API Gateway (I think… like I said, I’ll have to check my notes :) This “architecture” was a pretty simple way to shuttle GET and POST requests asynchronously between locally running Python client(s) and a self-hosted LLM running on my local network. “Hybrid cloud” FTW!
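
For a rough flavor of what the relay glue looked like conceptually, here’s a hypothetical sketch of a single Lambda handler that queues POST-ed requests in DynamoDB and hands pending work back on GET. The table name, field names, and event shape below are placeholders – I’ll confirm the real configuration once I dig up my notes:

```python
# Hypothetical sketch of the relay's Lambda handler (names and schema are placeholders).
# POST: a client enqueues an OpenAI-compatible request body.
# GET: the home-network "worker" polls for pending requests to forward to the local LLM.
import json
import uuid

import boto3
from boto3.dynamodb.conditions import Attr

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("relay-requests")  # placeholder table name

def lambda_handler(event, context):
    method = event.get("httpMethod", "GET")

    if method == "POST":
        # Store the raw request body with a fresh ID and a "pending" status flag.
        item = {
            "request_id": str(uuid.uuid4()),
            "status": "pending",
            "body": event.get("body", "{}"),
        }
        table.put_item(Item=item)
        return {"statusCode": 200, "body": json.dumps({"request_id": item["request_id"]})}

    # GET: return any pending requests for the local worker to pick up.
    pending = table.scan(FilterExpression=Attr("status").eq("pending")).get("Items", [])
    return {"statusCode": 200, "body": json.dumps(pending)}
```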

As for the local LLM serving component, I started out leveraging “LM Studio” because it ran easily on Windows 10. This was around the time that Meta’s “Llama 2” foundation model was getting really big and people were especially excited about locally hosting seven billion-ish parameter (“7B”-sized) foundation models on their slick Nvidia gaming GPUs – and definitely not as a way to justify that expensive GPU purchase to themselves and/or their significant other(s) :) Plus, people were having fun fine-tuning these open source LLM foundation models for specific functions.

For me personally, a big motivation was that local hosting made it much easier to run jailbreaking experiments. What I lost in de facto model intelligence to the video RAM (VRAM) constraints of my GPU, I gained in methodological wins – I could reliably remove the “just-in-time refusal” variables (and safety layers more generally) that most of the corporate players were implementing when hosting their models behind API layers. Finally, the idea of not paying OpenAI (or any of the other big players at the time, for that matter) for all those API calls was very attractive – at least in principle :) Hosting local models allowed for a LOT more experimentation from a research perspective. It let me take a more “yolo/test in prod”-style approach with my prototype code – I didn’t have to worry about a while loop accidentally never breaking and racking up hundreds of dollars of API charges, for example. I was OK with that same while loop accidentally making a lot of Lambda/DynamoDB calls, I guess, since those seemed (at least relatively speaking) a lot cheaper.

So you might be able to see the problem now – if I wanted a client application (no matter what network it was on, no matter where in the world it was) to be able to talk to my locally hosted model(s) on my home network (where my expensive GPU(s) live(s)), I’d need to find some way to bridge the gap (literally) – hence the name “cloud relay service”.
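
For reference, the “locally hosted” side of all this is just an OpenAI-compatible HTTP endpoint exposed by LM Studio (or similar). Something along these lines is roughly what the relay’s home-network worker would forward requests to – the URL below is LM Studio’s default local server address as I remember it, so treat it (and the model name) as an assumption:

```python
# Rough sketch of calling a locally hosted, OpenAI-compatible endpoint (e.g., LM Studio).
# The URL/port and model name are assumptions -- adjust to whatever your local server exposes.
import requests

LOCAL_ENDPOINT = "http://localhost:1234/v1/chat/completions"  # LM Studio's default, IIRC

payload = {
    "model": "local-model",  # LM Studio generally maps this to whatever model is loaded
    "messages": [{"role": "user", "content": "What is the capital of Sweden?"}],
    "temperature": 0.7,
}

response = requests.post(LOCAL_ENDPOINT, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```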

Similar to the “Say What?” post, I plan on posting some code here. It’s going to be hard to post too much since I implemented very limited security/privacy abstractions when I was writing this just for myself at first. Plus, it would seem a lot of AWS development boils down to (rightfully complex) configuration management (at least that was my impression at the time, and it continues to be to a large extent.) My goal is to post “point in time” snapshots, so I don’t plan on doing too many edits in order to make this development work public. Maybe I’ll share some of my “devops notes” artifacts instead as a compromise. Fair warning now though – messy/somewhat janky documentation at best here.

“Better Alexa”

This was an attempt to prototype a device that answers general “Jeopardy”-style questions when you ask the question out loud to “Better Alexa” (for lack of, or due to, a “better” name). You know – for those times when you just really gotta know the capital of Sweden or what’s currently the tallest building in the world, and other similar very important questions that just need answering right now! :)

The idea for this project actually started out as an attempt to build a relatively small, potentially concealable device that you could wear (at least in part) as an earpiece to “cheat” (with permission of course – probably) at bar trivia nights. But yeah… embedded dev (/me shivers.) So not being an embedded dev, having only done (relatively) superficial-level security testing in this space (and, let’s be honest… also “disliking” C/C++/assembly software development when there are no exploit chains to get excited about), I ended up prototyping the software on a Raspberry Pi as the device instead. At first I thought I’d just use the Raspberry Pi to connect a speaker and microphone combo, but I ended up also adding a screen to make it easier to debug in practice. This turned out to be a good idea. I could focus on the software and networking side by programming towards Ubuntu rather than some esoteric hardware that I lacked the motivation to learn well enough to get the job done for the prototype. Now that I think about it, I think this was my original motivation for building out the “cloud relay service” project (see planned post discussion above) – so maybe I’ll write this one before that one – hmm.
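
At a high level the prototype is just a listen -> transcribe -> ask-an-LLM -> speak-back loop. Here’s a rough sketch of that pipeline; the specific libraries (speech_recognition, pyttsx3) and the endpoint URL are illustrative stand-ins, not necessarily what the actual prototype used:

```python
# Illustrative sketch of the "Better Alexa" loop: listen -> transcribe -> ask LLM -> speak.
# Library choices and the endpoint URL are stand-ins, not necessarily the prototype's.
import requests
import speech_recognition as sr
import pyttsx3

LLM_ENDPOINT = "http://localhost:1234/v1/chat/completions"  # local model or cloud relay

recognizer = sr.Recognizer()
tts = pyttsx3.init()

def ask_llm(question: str) -> str:
    """Forward the transcribed question to an OpenAI-compatible chat endpoint."""
    payload = {
        "model": "local-model",
        "messages": [{"role": "user", "content": question}],
    }
    resp = requests.post(LLM_ENDPOINT, json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

while True:
    with sr.Microphone() as source:
        print("Listening...")
        audio = recognizer.listen(source)
    try:
        question = recognizer.recognize_google(audio)  # web-based STT, for illustration only
    except sr.UnknownValueError:
        continue  # didn't catch that -- keep listening
    print(f"Heard: {question}")
    answer = ask_llm(question)
    print(f"Answer: {answer}")
    tts.say(answer)
    tts.runAndWait()
```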

I’ll show some photos and/or video of interacting with the “Better Alexa” device from a remote network (relative to the LLM served from my gaming PC), assuming I can scrounge those up off the clouds. And if I can get the device up and running again (read: /me finally cleans out some of his office storage boxes and actually finds it), I’ll pull the device-specific code and Linux service configuration files off of it and share those too. Sorry for the mouthful there… heh. These posts are meant to encourage me to draft over edit… Yes, yes… That’s it :)

GenAI: Art vs. Artist

A perhaps shorter, more philosophical post musing about the idea of “art vs. artist.” There’s a lot of debate in creative communities these days about “AI art” and I think that could be a good concrete issue to talk about as it relates to this more abstract topic. Once we start seeing more agentic AI in the “real world” (whatever that means these days) I think this issue is going to come to the forefront a lot more. So some thoughts on that here in addition to discussing the diffusion-based models (e.g., Stable Diffusion) that I’ve experimented with so far in an attempt to keep this more philosophical post (somewhat) grounded.

GenAI: Safety vs. Security

I have a lot of thoughts about the current GenAI era we find ourselves in as it relates to GenAI “safety” vs. GenAI “security” and how the two relate to each other – starting with whether there’s even an intersection here at all! I’ve found that one’s view here depends a lot on who you ask, and on that party’s myriad motivating interests. Expect some discussion reflecting on the current security industry and what I perceive as some “legacy” incentive structures that I think are getting in our collective way, so to speak. Probably some very non-specific thought experiments about where things can go wrong, as well as where I think we’ve “gone right” thus far – to keep things upbeat even! :) I’m excited about this one, so I’m going to leave it towards the end of the series as a motivation/writing hack for myself.

Outro/Conclusion/Now What?/What’s Next?

A post that’s like this one – but in reverse? It will reflect on the things I’ve worked on at a broad level. Think the type of thing you’d see at the end of those AP English five-paragraph essays some of us had to (begrudgingly) write in high school, but with more fun acronyms/initialisms :) I’ll include some discussion of where I think I’m trying to “go next.” This will serve as a setup for my future “non-S3” blog posts as well as my broader directional intent(s) for my writing on this blog.

Stephen McCarthy
Emerging Technologies Researcher

Just another cryptonaut.