# 🚀 Paperclip AI + OpenCode CLI + LM Studio + Qwen 3.6

Started by Theo Gottwald, April 17, 2026, 09:23:02 AM


Theo Gottwald

## A powerful local-first AI company stack for serious automation 🤖

Paperclip AI is one of the most interesting new tools for organizing AI agents into a real managed structure. Combined with OpenCode CLI, LM Studio, and Qwen 3.6, it opens the door to a serious local-first coding and automation workflow. 🤖💻

Here are some comments; a more detailed description is below.

1. To use this combination you NEED the most recent LM Studio, 0.4.12; otherwise the computer may hang or even crash.
This is mainly because the parallel-processing option is used heavily.

LMS_02.png

2. The new local Qwen 3.6 model is perfect for such tasks; using a paid coding plan for this type of app is possibly expensive, and the results are possibly questionable. I have had it running for a while now, and after several errors that needed to be fixed first (the program is new), my impression is that the fictional company does not yet manage to organize itself properly, while still consuming a lot of tokens.

PClip.png

3. The usable context size with the new Qwen 3.6 model is 262K tokens in total.
With multiprocessing (as we see in the pictures), LM Studio shares the context size between the processes.
And if Paperclip runs 4 processes at the same time, it is possible that 1/4 of 262,000 tokens is simply not enough; see below.
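
A quick back-of-the-envelope check of that split (the even division across workers is my assumption based on the screenshots):

```python
# Rough per-worker context budget if LM Studio splits the window evenly.
# The even split across processes is an assumption based on the screenshots.
TOTAL_CONTEXT = 262_144   # Qwen 3.6 context window in tokens (2**18, "262K")
WORKERS = 4               # parallel Paperclip processes

per_worker = TOTAL_CONTEXT // WORKERS
print(per_worker)  # 65536 tokens per worker, before any system prompts
```

65K tokens per agent sounds like a lot, but system prompts, tool schemas, and long code files can eat it quickly.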

PPC3.png

4. Is it a good idea?
Generally yes, it is a good idea.
I had the same idea some time ago, but could not build the implementation myself.
Whether this implementation is good enough is another question.
I will let it run with my local AI and see if it can do something useful.

5. Used together with the SindByte MCP-Server, it could employ a number of virtual agents that do trades on Kraken (just an example), and each of these could try to earn more than the other agents.
This could be an interesting combination to test later.

6. This program combination seems to run permanently (depending on the Heartbeat settings; the Heartbeat is a timer).
Have you ever wanted your computer to at least "do something" even while you sleep?
This is it. However, if connected to a paid "Coding Plan" or API, it will burn money from morning to evening.
So my recommendation is to only use this with local AI, unless you can really get it to do something that is worth the cost.

Details.
If you have been looking for a way to move beyond "just chatting with an AI" and toward a **structured, controllable multi-agent workflow**, the new **Paperclip AI** stack is one of the most interesting developments right now. Paperclip is positioned as an **open-source orchestration platform for AI agent teams**, with a Node.js server and React UI that lets you organize agents via org charts, budgets, governance, tickets, and audit trails. In short: it is not just another single-agent assistant — it is designed to coordinate multiple agents toward business or project goals. ([GitHub][1])

## 🧠 What Paperclip actually is

Paperclip's core idea is simple but important: instead of running isolated AI terminals and losing track of who is doing what, it gives you a **central control layer**. According to its project pages, it supports concepts such as:

* **Bring your own agent**
* **Goal alignment**
* **Heartbeat-based wakeups**
* **Cost controls / monthly budgets**
* **Ticketing and audit logging**
* **Governance / approvals**
* **Org charts and role hierarchy**
* **Multi-company separation** ([GitHub][1])

That means Paperclip is best understood as a **management and orchestration shell** around other AI workers. It does not replace your coding agent — it coordinates and supervises it. This is especially useful when you want a CEO/CTO/developer/researcher style structure instead of a single monolithic assistant. ([GitHub][1])

## 💻 Why OpenCode CLI fits so well

This is where **OpenCode CLI** becomes very interesting. OpenCode is an **open-source coding agent** with both a TUI and CLI workflow, and it can be run interactively or programmatically. Its CLI supports commands such as `run`, `agent`, `models`, `mcp`, `serve`, `session`, `web`, and more, which makes it highly suitable as a worker engine inside a larger orchestration system. ([OpenCode][2])

In practical terms, OpenCode is a strong fit for Paperclip because:

* it already behaves like a terminal-native coding agent,
* it supports configurable agents,
* it can be driven in scripted or backend-style workflows,
* and it is designed to connect to different model providers. ([OpenCode][2])
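
Because OpenCode exposes a plain CLI, an orchestrator can drive it with ordinary process calls. A minimal sketch (the `opencode run` invocation follows the CLI docs cited above; the exact flags on your install may differ):

```python
import subprocess
import sys

def run_agent(cmd: list[str]) -> str:
    """Run a worker command and return its stdout; raises if it fails."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout

# With OpenCode installed, a call might look like (check `opencode run --help`):
#   run_agent(["opencode", "run", "add unit tests for utils.py"])
# Demo with a stand-in command so this sketch runs anywhere:
print(run_agent([sys.executable, "-c", "print('ok')"]).strip())  # ok
```

This is essentially what any orchestration layer does under the hood: spawn the worker, capture its output, and feed the result back into its own bookkeeping.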

So the combination looks like this:

* **Paperclip** = the management layer 🧾
* **OpenCode CLI** = the coding worker / execution layer 🛠️
* **LM Studio** = the local inference backend 🖥️
* **Qwen 3.6** = the model brain 🧠

That architecture is one of the cleanest current local-first setups for people who want real agent coordination rather than a single prompt box.

## 🏠 Why LM Studio is the key local piece

**LM Studio** is the part that makes the setup much more attractive for privacy-focused or hardware-heavy users. LM Studio explicitly positions itself as a way to **run AI models locally and privately**, with support for local hardware, an OpenAI-compatible API, and even a **headless deployment mode** via `llmster`. It also exposes developer resources, SDKs, and a CLI (`lms`). ([LM Studio][3])

This matters because OpenCode and similar coding tools work well when they can point to an **OpenAI-compatible local endpoint**. LM Studio provides exactly that. It also now promotes **LM Link**, which allows remote LM Studio instances to be used as if they were local; LM Studio explicitly says that tools already targeting the local LM Studio server can use LM Link models as well, including tools like **OpenCode**. ([LM Studio][3])
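
As a concrete illustration, here is the shape of the OpenAI-compatible chat request that such tools end up sending to LM Studio's local server (default port 1234, endpoint `/v1/chat/completions`); the model identifier is an assumption, so use whatever name LM Studio shows for your loaded model:

```python
import json

# OpenAI-compatible chat-completion payload for a local LM Studio server.
# POST it to http://localhost:1234/v1/chat/completions with any HTTP client.
payload = {
    "model": "qwen3.6-35b-a3b",  # assumed identifier; copy yours from LM Studio
    "messages": [
        {"role": "system", "content": "You are a concise coding agent."},
        {"role": "user", "content": "Write a hello-world in C."},
    ],
    "temperature": 0.2,
}

body = json.dumps(payload)
print(body)  # the serialized request body
```

Any tool that speaks this wire format (OpenCode, plain `curl`, the `openai` SDK with a custom `base_url`) can talk to LM Studio without code changes.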

So if your goal is:

* **local inference**
* **less cloud dependency**
* **better privacy**
* **use of your own GPU hardware**
* **OpenAI-compatible access for agent tools**

then LM Studio is one of the most practical bridges currently available. ([LM Studio][3])

## 🔥 And now comes Qwen 3.6

The newest major model in this chain is **Qwen 3.6**. Qwen officially announced **Qwen3.6-Plus** on **April 1, 2026**, describing it as a model aimed at **real-world agents** with improvements in **coding agents, general agents, and tool usage**, specifically through tighter integration of reasoning, memory, and execution. ([Qwen][4])

Even more interesting for local users: just days later, Qwen also released the first **open-weight** Qwen 3.6 variant, **Qwen3.6-35B-A3B**, describing it as a sparse MoE model optimized for **stability** and **real-world utility**, with an emphasis on a more productive coding experience shaped by community feedback. ([Hugging Face][5])

That makes Qwen 3.6 particularly relevant for this stack for several reasons:

### ✅ Why Qwen 3.6 is a strong match

* It is being positioned for **agentic workflows**, not just plain chat. ([Qwen][4])
* It has a strong emphasis on **coding and terminal-style execution tasks**. ([Qwen][4])
* There is now an **open-weight 35B-A3B** release, making local deployment much more realistic than relying only on a hosted flagship model. ([Hugging Face][5])
* LM Studio already advertises support for the **Qwen3** family among local models. ([LM Studio][3])

For people building a private coding stack, this is a big deal: you can combine **Paperclip orchestration**, **OpenCode execution**, **LM Studio serving**, and **Qwen 3.6 reasoning** into a system that is much closer to a real AI operations environment than a normal chatbot setup.

## 🧩 Why this combo is exciting

What makes this setup stand out is not any single component by itself, but the way the pieces complement each other:

### 1. Paperclip adds structure 📋

Without orchestration, multiple AI tools quickly become chaos. Paperclip adds hierarchy, tasks, budgets, and traceability. ([GitHub][1])

### 2. OpenCode adds practical coding muscle 🛠️

OpenCode is not merely a static chat UI. It is a terminal-centric coding agent with explicit CLI workflows and backend attach/serve options. ([OpenCode][6])

### 3. LM Studio adds local control 🔒

LM Studio provides the model serving layer on your own machine, with OpenAI-compatible access and headless/server deployment options. ([LM Studio][3])

### 4. Qwen 3.6 adds a more agent-oriented brain 🧠

Qwen 3.6 is explicitly being presented as stronger at agentic coding, tool use, and execution-oriented tasks than earlier generations. ([Qwen][4])

## 🛠️ Example usage scenario

A very realistic setup could look like this:

* **Paperclip CEO** receives the high-level business or development goal
* **Paperclip CTO / engineer agents** use **OpenCode CLI**
* OpenCode connects to **LM Studio**
* LM Studio runs **Qwen 3.6** locally
* Paperclip tracks tasks, approvals, budgets, and audit logs

This gives you a workflow where the AI is no longer just "answering questions," but instead:

* planning work,
* assigning work,
* executing coding tasks,
* reviewing outputs,
* and maintaining organizational context over time.

That is exactly the kind of setup many advanced users have wanted for local AI for quite a while.

## ⚠️ What to keep in mind

This stack is powerful, but it is not magic.

* **Paperclip** adds orchestration, but that also means more moving parts. ([GitHub][7])
* **OpenCode** is flexible, but agent workflows still depend heavily on good model behavior and solid tool integration. ([OpenCode][6])
* **LM Studio** makes local inference easy compared to raw server tooling, but you still need enough hardware for the model size you choose. ([LM Studio][3])
* **Qwen 3.6-Plus** is a hosted flagship-class model, while **Qwen3.6-35B-A3B** is the newly released open-weight option better suited to local deployment; these are not the same thing, so users should choose based on hardware and goals. ([Qwen][4])

## ✅ Bottom line

For anyone interested in **local AI agents that do real work**, this is one of the most compelling current combinations:

**Paperclip AI** gives you the org chart and control plane.
**OpenCode CLI** gives you the coding agent runtime.
**LM Studio** gives you the private local model server.
**Qwen 3.6** gives you a modern agent-oriented reasoning model.

Put together, this creates a serious foundation for **local-first autonomous coding teams** rather than a single isolated assistant. 🚀

If the ecosystem continues to mature, this kind of stack could become a very attractive alternative to cloud-only agent workflows — especially for developers, researchers, and small teams who want **privacy, cost control, and direct ownership of their AI infrastructure**. 🔥

[1]: https://github.com/paperclipai/paperclip "GitHub - paperclipai/paperclip: Open-source orchestration for zero-human companies · GitHub"
[2]: https://opencode.ai/?utm_source=chatgpt.com "OpenCode | The open source AI coding agent"
[3]: https://lmstudio.ai/ "LM Studio - Local AI on your computer"
[4]: https://qwen.ai/blog?id=qwen3.6&utm_source=chatgpt.com "Qwen3.6-Plus: Towards Real World Agents"
[5]: https://huggingface.co/Qwen/Qwen3.6-35B-A3B?utm_source=chatgpt.com "Qwen/Qwen3.6-35B-A3B"
[6]: https://opencode.ai/docs/cli/ "CLI | OpenCode"
[7]: https://github.com/paperclipai "Paperclip · GitHub"

Theo Gottwald

#1
To really use Paperclip with the SindByte MCP-Server, you need to download the newest version.
Then use Open Code, and connect Open Code NOT to the MCP-Server but to the "OpenAI-compatible endpoint" that is also built into the SindByte MCP-Server.
Doing so, Open Code can directly use all MCP-Server tools and can use the model loaded in LM Studio.
And this way Open Code, and therefore Paperclip, can organize a "Trading Company", for example on KRAKEN,
where "virtual employees" can trade with paper or real money (paper trading is supported) and try to make a profit.

You could tell the "virtual CEO" to employ different sorts of traders, have them do paper trading with their systems,
and fire them if they do not win much.

And those that do a good Job, let them trade with your real account.

Technically this is possible; you need the newest version of "SindByte", which is 1.9.05, for that.
It has the endpoint compatibility with Open Code.

Using the new Qwen 3.6 as the local model, Open Code can do a lot using LM Studio. And it is altogether FREE.


IMPORTANT ADDITION: To use the endpoint with tools, you need the newest version, which allows using LM Studio with a "local API key".

Stan Duraham

#2
It looks like Qwen 3.6, using LM Studio, is really good with FreeBASIC. It took a wrong turn on me, but I did get it back on track. You have to be specific. I have 32GB RAM and 16GB VRAM, just enough to get by. I may need more RAM for a bigger context window. It won't speed things up, but it will allow longer sessions for corrections and improvements.
If it's good with FreeBASIC then it must be great for C++ and more traveled paths.
Google AI mode in the browser is also really good with FreeBASIC.

Added note: AI has a tendency to use outdated code, because that's what most of the code on the web is. You have to point it in the right direction.

Theo Gottwald

#3
@Stan Duraham Qwen is good with PowerBASIC as well; of course, all of them are best with C.

You can make it work with 16 GB as well;
you just need to put some of the MoE weights into RAM. Use this:

2026-04-20 08_08_32-Greenshot.png

It will definitely also run at an acceptable speed with 16 GB VRAM using the 35B version.

I am currently testing "SpecKit"; this may be helpful in larger projects.




Stan Duraham

The local models have more limitations and are probably best used on the most trodden paths. I bit the bullet and laid down 400 smacks to upgrade to 64GB. That gives me the max context window. The context length is the amount of information that stays live. If you run out of context, AI stops. Now I can keep the same project open as long as I want and keep building on it or drop a file and start from there. It uses over 50GB.
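
One reason context length eats RAM so quickly is the KV cache, which grows linearly with the number of tokens. A rough estimate (all architecture numbers below are ASSUMED for illustration; read the real values from your model's config):

```python
# Rough KV-cache size: 2 (K and V) * layers * kv_heads * head_dim
# * bytes per value * tokens. Architecture numbers are assumed here.
layers, kv_heads, head_dim = 48, 8, 128
bytes_per_value = 2            # fp16 cache; quantized caches are smaller
tokens = 262_144               # the full context window

kv_bytes = 2 * layers * kv_heads * head_dim * bytes_per_value * tokens
print(f"{kv_bytes / 2**30:.1f} GiB")  # 48.0 GiB at these assumed numbers
```

At fp16 with these assumed numbers, the cache alone lands in the tens of gigabytes on top of the weights, which matches the "over 50GB" observation and explains why quantized caches or smaller effective windows matter in practice.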

I tried building an editor in C and C++ using GCC on Windows. Bad idea, because that goes through one compiler that has to translate code written for another compiler. Lots of flags, typedefs, and macros. It does find mistakes and make corrections. Better to use the MS compiler for Windows code.

PowerBASIC might do pretty well because it hasn't changed in a long time.

If I tried to write a Windows application with FreeBASIC using its library, I'd probably have the same problem. FreeBASIC -> GCC -> MS compiled code.

I tried building an editor in FreeBASIC using José Roca's library. I gave it too much and didn't have enough context memory. Qwen 3.6 really liked José's code.

I suspect that Qwen 3.6 might do well on José's library with enough context memory because it's a complete wrapper. The only problem there is me. You're expecting a lot of local models to write code on a library you don't have a good grasp of. But the combination might be a good way to learn it.

I'm going to play around with it. There's a lot there, just have to figure out how to use it to your advantage.

Theo Gottwald

@Stan Duraham If you do not want to buy a Kimi Coding Plan (it's very helpful and keeps costs accountable), then I suggest you install "Open Code CLI" and use SpecKit. And of course use Qwen 3.6, maybe the Q4 quant if possible, or try a little smaller. Having enough RAM is crucial.


🚀 Good alternative if you don't want to buy a Kimi Coding Plan

If you do not want to buy a Kimi Coding Plan — even though it can be very helpful for coding workflows — then I strongly suggest this setup instead: 👇

1) Install OpenCode CLI 💻
OpenCode is a very interesting coding agent for terminal-based workflows and a strong option if you want a more flexible setup.

Official links:
[OpenCode Website](https://opencode.ai/)
[OpenCode Docs](https://opencode.ai/docs/)
[OpenCode Download](https://opencode.ai/download/)
[OpenCode GitHub](https://github.com/opencode-ai/opencode)

2) Use Spec Kit / Spec-Driven Development 🧠
Instead of jumping straight into code, use Spec Kit to structure the work properly:
requirements ➜ plan ➜ tasks ➜ implementation.
That often gives much cleaner and more controllable results, especially for bigger coding projects.

Official links:
[Spec Kit GitHub](https://github.com/github/spec-kit)
[Spec Kit Docs](https://github.github.com/spec-kit/)
[Spec Kit Installation Guide](https://github.github.com/spec-kit/installation.html)
[Spec Kit Releases](https://github.com/github/spec-kit/releases)

3) For the model, try Qwen3.6 ⚙️
I would absolutely test Qwen3.6.
If possible, try a Q4 / 4-bit quant. If that is too heavy for your machine, go a bit smaller.

Official Qwen links:
[Qwen3.6 GitHub](https://github.com/QwenLM/Qwen3.6)
[Qwen3.6 Official Hugging Face Model](https://huggingface.co/Qwen/Qwen3.6-35B-A3B)
[Qwen3.6 Release Blog](https://qwen.ai/blog?id=qwen3.6-35b-a3b)
[Qwen Studio](https://chat.qwen.ai/)

Useful GGUF / local-model related link:
[LM Studio Community GGUF for Qwen3.6](https://huggingface.co/lmstudio-community/Qwen3.6-35B-A3B-GGUF)

4) Very important: RAM matters a lot 🧩
Having enough RAM / VRAM is crucial.
A local coding model may look great on paper, but if your machine is too tight on memory, the experience quickly becomes frustrating:
slow responses, swapping, unstable generation, and poor usability.

So my advice is:

Best path without Kimi:
• OpenCode CLI
• Spec Kit
• Qwen3.6
• preferably a Q4 quant if your hardware can handle it
• otherwise choose a slightly smaller model or smaller quant

5) If you still want to look at Kimi 📌
For anyone who still wants to compare first:

[Kimi Code Docs](https://www.kimi.com/code/docs/en/)
[Kimi Pricing](https://www.kimi.com/membership/pricing)

My personal recommendation: 🔥
If you want a setup that is powerful, structured, and cost-conscious, then:
OpenCode CLI + Spec Kit + Qwen3.6 is absolutely worth trying.

It gives you:
• a solid coding interface
• a much better workflow structure
• more control over how you work
• and, with the right hardware, a very strong local or semi-local coding setup

🚀 In short:
No Kimi Coding Plan?
Then install OpenCode CLI, use Spec Kit, and run Qwen3.6 — ideally in Q4 if your RAM allows it.

If your RAM is limited, go a little smaller — but definitely keep the same workflow idea. 👍

PS: I verified the current official pages for **OpenCode**, **Spec Kit**, **Kimi Code**, and **Qwen3.6** before drafting this. OpenCode is currently offered as a terminal tool, desktop app, and IDE extension; Spec Kit's maintainers explicitly recommend installing it from the official GitHub repo; and Qwen3.6 is an official current model line with official weights on Hugging Face and ModelScope. ([OpenCode][1])

[1]: https://opencode.ai/docs/ "Intro | AI coding agent built for the terminal"


Theo Gottwald

#6
I finally got it running.

2026-04-20 22_15_01-Org Chart · Paperclip — Originalprofil — Mozilla Firefox.png

The trick is:
2026-04-20 16_11_09-Greenshot.png

This needs a new Sindbyte 01.exe, v1.9.10 or higher, which will be available after further fixes in the next few days.

Making Open Code use the MCP-Server while being connected to Paperclip is not easy, because it would need to use an API key for LM Studio; otherwise LM Studio will NOT allow usage of its MCP servers.

But Paperclip does not want to do that.

So what we do is have the Sindbyte server "inject" the API key for LM Studio transparently.
And this way I got my Trading Company running.
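
The injection idea can be sketched in a few lines. This is NOT SindByte's actual code, just an illustration of what "transparently injecting" a key means: a middle layer adds the Authorization header before forwarding the request to LM Studio, so the client never has to know the key.

```python
# Toy illustration of transparent API-key injection by a proxy layer.
# Header name follows the usual Bearer convention; the key value is made up.
def inject_api_key(headers: dict, api_key: str) -> dict:
    """Return a copy of the headers with a Bearer token added."""
    forwarded = dict(headers)  # never mutate the caller's headers
    forwarded.setdefault("Authorization", f"Bearer {api_key}")
    return forwarded

client_headers = {"Content-Type": "application/json"}
forwarded = inject_api_key(client_headers, "lm-studio-local-key")
print(forwarded["Authorization"])  # Bearer lm-studio-local-key
```

The proxy then forwards the modified request to LM Studio, which sees a properly authenticated call while Paperclip remains unaware of the key.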

They got $50 to waste; let's see what they can make of it. One thing is sure: they will do it better than me.

Note: I did not personally set up Paperclip; I had Kimi-Code do that. It's enough to just found the company manually, go to the CEO, generate an API key there, and hand it over together with the company ID number to Kimi-Code. It can then use the API and manage the company.

2026-04-20 10_13_03-Greenshot.png


Zlatko Vid

> I tried building an editor in C and C++ using GCC on Windows

I do the same thing using Z.ai and in C,
and Z.ai makes it .. ;D

Stan Duraham

Thanks for the information. I'm totally new at this, but I'm starting to get the picture. I have an i5, no VRAM, 64GB RAM.

qwen3.6-35b-a3b appears to be very good; however, its training data ends in 2024. It can't be used for FreeBASIC code.

For C, and probably C++, the qwen/qwen3-coder-next local model is faster and appears to be pretty good with C.

Using qwen/qwen3-coder-next, I asked it: "Using a C compiler to be compiled with GCC on Windows, please build a resizable Windows application with a browser in it supporting IWebBrowser, not IWebBrowser2."
It immediately started spitting out code. Two errors. I sent them back one at a time, and then it compiled and worked.
However, it told me that GCC can have a problem with COM. So I'm going to try it again with the MS compiler.

But that's powerful to build working COM code using C. I realize heavy metal AI would be better.