Run Your Privacy-Respecting AI on Gentoo Linux

Would you like to have ChatGPT summarize YouTube videos for you, without OpenAI knowing about the lewd content you’re watching, and without paying them an arm and a leg for it? You can, and it’s actually fairly straightforward.

In this brief write-up, we will be looking at setting up our own “AI” (a.k.a. LLM, a.k.a. Large Language Model) that we can easily query from the command line. We’re going to do all that on a Gentoo Linux system, and we’re going to use AMD graphics, meaning we’re going to install ROCm.

I’m assuming you already have a running Gentoo system, either on a server or on your desktop machine. I’m also assuming that you have more-or-less recent AMD hardware, ideally a powerful GPU, that you have already set up. The setup will also work on a laptop with integrated graphics (iGPU); in that case, however, the “AI” might not be able to utilize your GPU due to VRAM constraints. That’s not a deal-breaker, but it will make queries a lot slower.

We’re using Ollama to run our LLM model(s). Back in March, the project announced support for AMD graphics hardware on Linux, and we’re going to try to make use of that.

Before we install Ollama, we need to set up the dependencies. First, unmask the following packages (where exactly these entries go is shown right after the list):

dev-libs/rocm-opencl-runtime ~amd64
dev-libs/rocr-runtime ~amd64
dev-libs/rocm-comgr ~amd64
dev-libs/rocm-device-libs ~amd64
dev-build/rocm-cmake ~amd64
dev-libs/roct-thunk-interface ~amd64
dev-util/rocminfo ~amd64
dev-util/rocm-smi ~amd64
dev-util/hip ~amd64
dev-util/hipcc ~amd64
dev-perl/URI-Encode ~amd64
sci-libs/clblast ~amd64
dev-python/pipx ~amd64
dev-python/userpath ~amd64
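
In case you’re wondering where these entries go: on Gentoo, keyword acceptances like the above go either into a single /etc/portage/package.accept_keywords file or into a file inside a directory of the same name. A minimal sketch for the latter, using an arbitrary file name:

# as root; the file name "rocm" is just an example
mkdir -p /etc/portage/package.accept_keywords
$EDITOR /etc/portage/package.accept_keywords/rocm   # paste the lines above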

Next, emerge them:

emerge -a \
  dev-lang/go \
  dev-libs/rocm-opencl-runtime \
  dev-libs/rocr-runtime \
  dev-libs/rocm-comgr \
  dev-libs/rocm-device-libs \
  dev-build/rocm-cmake \
  dev-libs/roct-thunk-interface \
  dev-util/rocminfo \
  dev-util/rocm-smi \
  dev-util/hip \
  dev-util/hipcc \
  sci-libs/clblast \
  dev-python/pipx

It might be that you need to unmask a few more dependencies.

While packages are building and installing, we can go ahead and clone the Ollama repository:

git clone --recurse-submodules --depth 1 \
  https://github.com/ollama/ollama
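
Once the emerge has finished, it can’t hurt to verify that ROCm actually sees your GPU. Assuming everything installed correctly, rocminfo and rocm-smi (both part of the packages we just emerged) should list your card:

rocminfo | grep -i 'marketing name'
rocm-smi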

Once the dependencies have been emerged, we can go ahead and build Ollama:

cd ollama/
go generate ./...
go build .

Depending on your machine, this might take a while. It’s also possible that go generate aborts, in which case you might need to install additional dependencies.

When go build finishes successfully we’ll end up with a binary called ollama in that same directory. We can go ahead and move the binary to a place that’s within our PATH environment variable (echo $PATH). For me, that’s ~/.go/bin/. Now we can launch Ollama by typing ollama serve into a new terminal window.
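
For reference, that boils down to something like the following (using the ~/.go/bin/ directory mentioned above; adjust it to whatever directory is in your PATH):

mkdir -p ~/.go/bin
mv ollama ~/.go/bin/
ollama serve   # run this in a new terminal window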

Next, we need to install a model. We can see all available models in Ollama’s model library. I’m going to go with llama3:latest, which is the 8B version of llama3 and requires at least 8GB of RAM. You can go with llama3:70b if you have at least 64GB of RAM, though. The higher the parameter count (8 billion vs. 70 billion), the higher the model’s capacity to learn and process complex patterns, and the more memory it needs.

To install the model, execute ollama run llama3:latest in another shell. Remember, we started ollama serve and it should be up and running for this to work. You can go back to the terminal in which ollama serve is running and follow the model installation process.
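
If you’d rather download the model first and start chatting later, Ollama also offers the pull and list subcommands; the latter shows the installed models together with their size:

ollama pull llama3:latest
ollama list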

Okay, we set up ROCm, we installed Ollama, we launched it and we installed the llama3 model. What now?

We need a convenient way to query it. We could work directly with the Ollama API to ask llama3 something, e.g.:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt":"Why is the sky blue?"
}'

However, this is relatively inconvenient, especially once we need to formulate longer prompts for more complex tasks. A more convenient way to query Ollama (and other LLMs) is through Fabric.

Fabric is an open-source framework that helps people collect and integrate prompts – which the project refers to as Patterns – and makes them easily usable across a variety of real-world tasks. So instead of manually typing prompts for specific objectives, with Fabric you can use a pre-existing pattern to, for example, extract a summary out of a YouTube video or a podcast, write an essay in your own voice with just an idea as input, summarize academic papers, and a lot more.

Right now, Fabric is built on Python, but they’re working on migrating it to Go, in order to simplify installation. However, with Python being a first-class citizen on Gentoo, we’re able to get Fabric up and running in no time.

We’ve already installed pipx, which we’re going to use to install Fabric. For that, we first need to clone the Fabric repository:

git clone --recurse-submodules --depth 1 \
  https://github.com/danielmiessler/fabric.git

Then we change into the cloned repository and install Fabric using pipx:

cd fabric/
pipx install .

This will take some time. Go grab a cup of coffee. Once the installation has completed, we can open a new terminal window and run the Fabric setup:

fabric --setup

The setup will ask about OpenAI API URLs and tokens, which we can simply skip. What we should add, however, is a YouTube API key, in order to make use of the yt command that comes with Fabric. If you don’t have an API key, follow the official Google guide to get one.

After entering the API key and finalizing the setup, you should end up with a file at ~/.config/fabric/.env, which contains that API key. Open the file, add a new line at the end of it, and save it:

DEFAULT_MODEL="llama3:latest"

We’re now able to query our Ollama llama3 model through Fabric.
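
As a quick smoke test, we can pipe an arbitrary piece of text through the summarize pattern, which we’ll come back to further down:

echo "Gentoo is a source-based Linux distribution built around the Portage package manager." \
| fabric --pattern summarize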

Usage examples

Using the aforementioned yt command, for example, we can pull the transcript of a YouTube video and ask Fabric to use llama3 to extract a summary of its content:

yt --transcript https://www.youtube.com/watch\?v\=eeD3wS4fGEY \
| fabric --stream --pattern extract_wisdom

Depending on the performance of your hardware and whether Ollama is able to use the GPU or not – it should tell you during the launch of ollama serve – this can take anywhere from a few seconds to a couple of minutes. However, even with the example video here, whose transcript is auto-generated by YouTube, the result is quite impressive:

Here are the extracted insights:

**SUMMARY**

Louis, discussing the principle of Wu-way, which is based on knowledge of the 
tide and drift of things. He emphasizes the importance of not forcing anything 
that we do.

**IDEAS**

* Way means not forcing in anything that you do.
* The principle of Wu-way is based on knowledge of the tide and drift of things.
* One should use muscle only at the right moment, like in Judo.
* Superior virtue has no intention to be virtuous.
* Inferior will tries to be Wu-way, but it isn't.
* It's not a matter of cultivated passivity or spontaneity.
* You have to become again as a child to see the space that is the foundation of 
  the universe.

**INSIGHTS**

* The importance of quieting the mind and letting go of thoughts and opinions.
* The value of simplicity and being open-minded, rather than forcing one's
  perspective.
* The need to let go of habit energy and compulsive thinking in order to see 
  things as they truly are.

**PURPOSE**

The purpose is to explore the concept of Wu-way and its implications for our 
daily lives. It emphasizes the importance of not forcing anything that we do, 
and instead, allowing things to unfold naturally.

The local LLM can also be used to help with system monitoring:

cat /var/log/messages | tail -n 20 | fabric --pattern summarize
# ONE SENTENCE SUMMARY:

The system logs a series of events, including network interface carrier on/off, 
segfaults in libpipewire-module-protocol-native.so, and kernel re-mounting of 
file systems.

# MAIN POINTS:

1. Network interface enp3s0f4u2u3 has carrier off/on issues.
2. Segfaults occur in libpipewire-module-protocol-native.so on multiple
   occasions.
3. File system mount points are re-mounted with quota mode set to none.
4. NOHZ tick-stop error and NMI watchdog are triggered at various times.

# TAKEAWAYS:

1. The network interface is experiencing connectivity issues.
2. A critical library module is causing segfaults, potentially indicating a
   software bug or incompatibility issue.
3. File system re-mounting may be related to quota management or other
   configuration changes.
4. NOHZ tick-stop error and NMI watchdog events suggest potential hardware or
   kernel-level issues requiring investigation.

Another sysadmin-related use case is reading manuals. We can ask Llama to explain a tool to us, based on its man page:

man aircrack-ng | fabric --pattern explain_docs
It looks like you're asking me to summarize the identity and purpose of  
aircrack-ng, as well as provide information on its various options and features.

Here's what I've gathered:

**Identity:** aircrack-ng is a popular open-source tool for auditing wireless 
networks. It provides a comprehensive set of tools to test the security of 
802.11-based networks.

**Purpose:** The primary goal of aircrack-ng is to crack WEP and WPA-PSK keys by 
analyzing captured packets and performing various types of attacks.

**Options:**

1. **-n**: Specify the length of the key (64-bit, 128-bit, etc.)
2. **-i**: Only keep IVs with a specific key index
3. **-f**: Adjust the fudge factor to affect the brute-force level
4. **-k**: Disable KoreK attacks
5. **-x**: Control last keybyte bruteforcing (default: enabled)
6. **-X**: Disable multithreading for SMP systems
7. **-s**: Show ASCII version of the key
8. **-y**: Perform a single brute-force attack
9. **-z**: Use PTW (Andrei Pyshkin, Erik Tews, and Ralf-Philipp Weinmann)
           attacks
10. **-P**: Disable klein in PTW debug mode or use PTW
11. **-K**: Use KoreK attacks instead of PTW

**Additional features:**

1. **WEP and WPA-PSK cracking options**: Specify a dictionary file for WPA 
cracking, create a new session, restore a previous session, generate an 
Elcomsoft Wireless Security Auditor (EWSA) project file v3.02, or create a 
Hashcat v3.6+ Capture file (HCCAPX).
2. **SIMD selection**: Force another optimization choice based on the CPU and 
available instructions.

**Other options:**

1. **-H or --help**: Show help screen
2. **-u or --cpu-detect**: Provide information on the number of CPUs and SIMD 
support

That's a lot to take in! Do you have any specific questions about aircrack-ng or 
would you like me to clarify any of these options?

Developers might find it useful to get an overview of changes. We can ask the LLM for a summary of commits that happened between two tags:

cd neonmodem/
git log v1.0.0..v1.0.3 | fabric --pattern summarize_git_changes
It seems that you are an expert project manager and developer, and you 
specialize in creating to 0.5.0.

The commit history shows a series of commits made by the same author, マリウス 
(Marius), over a period of time. The commits appear to be related to 
implementing features for various platforms, such as Hacker News, Lemmy, and 
Discourse.

Some notable changes include:

* Bumps golang.org/x/image from 0.3.0 to 0.5.0.
* Implementing reply chunks querying for #10.
* Fixed config loading issue.
* Implemented dedicated posts loading, part of #10.

It looks like the project is being actively maintained and updated with new 
features and bug fixes.

If we’re unsure about terms of service or an end-user license agreement, we can have the “AI” generate a brief summary:

reader -o \
https://www.microsoft.com/en-us/UseTerms/Retail/Windows/11/UseTerms_Retail_Windows_11_English.htm\
| fabric --pattern check_agreement
This is the end-user license agreement (EULA) for a Microsoft software product, such as Windows or Office. The EULA outlines the rights and responsibilities of the user and the manufacturer (Microsoft) regarding the use and ownership of the software.

Some key points from this section include:

* The device manufacturer or installer, or Microsoft is liable only for slight negligence if they breach material contractual obligations that facilitate the due performance of this agreement.
* In other cases of slight negligence, the device manufacturer or installer, or Microsoft will not be liable.
* The user is responsible for understanding and complying with the terms of their own plans and agreements regarding networks, data, and Internet usage.
* The software may include H.264/MPEG-4 AVC and VC-1 decoding technology, which is licensed under specific patents and standards.
* Malware protection will be turned on if other antimalware software is not installed or has expired.
* Limited rights versions of the software are available for specific uses, such as academic or evaluation use.

Overall, this section outlines the legal terms and conditions under which the user may use the software, including limitations on liability, responsibility for compliance with third-party agreements, and restrictions on the use of certain features or technologies.

Hint: I’m using reader to fetch web content and automagically output it as markdown, so that I can pipe it directly into Fabric.

With data breaches happening on a daily basis, we can have the LLM generate brief digests of individual incidents. Let’s find out about the recent Dropbox hack:

reader -o \
https://sign.dropbox.com/blog/a-recent-security-incident-involving-dropbox-sign\
| fabric --pattern analyze_incident
Here is the extracted information:

**Attack Date:** April 24th (no specific year mentioned)

**Summary:** Unauthorized access to Dropbox Sign's production environment, 
exposing customer information.

**Key Details:**

* **Attack Type:** Compromised service account
* **Vulnerable Component:** Dropbox Sign automated system configuration tool
* **Attacker Information:**
	+ Name/Organization: Not specified
	+ Country of Origin: Not specified
* **Target Information:**
	+ Name: Dropbox Sign customers
	+ Country: USA (no specific impact mentioned)
  + Size: Large enterprise (not explicitly stated, but implied by the mention of 
    API keys and authentication)

**Customer FAQ:**

* **I'm a Sign customer - what has Dropbox done to protect me and what do I need 
  to do?**
	+ Expired password
	+ Logged out of any devices connected to Dropbox Sign
	+ Reset password as soon as possible
  + Rotate API key by generating a new one, configuring it with your 
    application, and deleting your current one (for API customers)
* **If I have a Sign account linked to my Dropbox account, is my Dropbox account 
  affected?**
	+ No, the incident was isolated to Dropbox Sign infrastructure
  + However, if you reused your Dropbox Sign password on any other services, 
    change your password and use multi-factor authentication when available (for 
    both Dropbox and Dropbox Sign accounts)
* **I'm a Sign API customer. Was my customers' data exposed as well?**
  + Names and email addresses for those who received or signed a document 
    through Dropbox Sign but never created an account were exposed
* **Where can I go for more information on this incident?**
  + Reach out to HelloSign's support team at [FAQ 
    link](https://faq.hellosign.com/hc/en-us/requests/new))

**Investigation Status:**

* Ongoing

Note: The exact year of the attack is not specified in the article.

Last but not least, we can have the LLM automatically create user stories from GitHub issues for our aGiLe PM:

cd neonmodem/
gh issue view 8 --json title,body --template '{{.title}}: {{.body}}' \
| head -n 1 \
| fabric --pattern agility_story
Here is a user story and acceptance criteria for the requested topic in JSON format:

{
    "Topic": "Reply Notifications",
    "Story": "As an online community member, I want to be notified when someone replies to my post or comment so that I can stay engaged with the conversation.",
    "Criteria": "Given that I have posted a message and am viewing the same thread on the system,
                When someone else posts a reply to my original message, then I should receive a notification in the UI.
                Given that I have commented on a post and am viewing the same topic page on the system,
                When someone else comments on the original post, then I should receive a notification in the UI."
}

Let me know if you need anything else!

With Ollama and Fabric we now have a powerful, free-of-charge, local LLM available to us, which we can easily query without having to formulate lengthy and complex prompts. And there is so much more that Fabric allows us to do. Check the project’s patterns folder to see what else it is capable of.


Enjoyed this? Support me via Monero, Bitcoin, Lightning, or Ethereum!  More info.