Overseas access: www.kdjingpai.com
Bookmark Us
Current Position:fig. beginning " AI How-Tos

Deployment of a personalized set of "small" model chat tools for low-cost computers

2024-08-18 2.0 K

Why deploy a private "mini" model chat tool?

Many people have used ChatGPT, Wisdom Spectrum, Beanbag, Claude and other excellent large language models, and there is a need for in-depth use will also buy third-party paid services, after all, they are very outstanding performance. For example, my main working scenario is to write articles, then I will choose Claude.

Although I love using Claude, do I really use it on a daily basis with high frequency? The answer is of course no!

Thresholds such as limitations on usage, price factors, network issues, etc., naturally reduce the frequency of use when it is not necessary. --If a tool can't be "picked up and used" in any environment, it's not working.

In this case, using a "small" model may be a better choice, why?

低配电脑部署一套个人专属的

 

 

"Small" model characteristics

GEMMA2, LLAMA3.1:8b, QWEN2:7b are small enough for daily use, with 32k long contextual inputs and outputs, most of the commands are followed, the ability to express yourself in Chinese, answer questions, and overall it's all good, and it doesn't have the limitations of "Wenxin Yiyi" which is only able to input 2000 characters. There is no similar limitation of "Wenshin Yiyan" products... It is sufficient for daily use, and we will consider specializing in special tasks. The advantages of the miniatures are as follows:

  • Support for context sizes no smaller (or even larger) than the larger model
  • Everyday writing tasks with not-so-low-quality output
  • unlimited use
  • Multiple miniatures can output answers concurrently for easy comparison.
  • Faster execution

 

What is private deployment?

A private chat WEB interface for easy customization and free access to "small" models.

The most classic solution is to deploy Ollama+Open WebUI locally, the former is responsible for running the miniatures on the local computer and the latter hosts the chat interface. Considering extranet use anytime, anywhere, you can usecloudflaremaybecpolarMap the address to an external network (search for the tutorial yourself).

 

vantage

  • Chatting data is local and private
  • Flexibility to customize local models

drawbacks

  • Difficult to run persistently (you always have to turn off your computer, right?) Difficult to publish to an extranet
  • High computer hardware requirements

 

issue to be addressed

It's the shortcomings that we're trying to address:

1. The deployed AI chat interface needs to be published to the extranet and have a stable access URL to be used anytime and anywhere

2. Computer hardware threshold is mainly the use of Ollama to run the model locally, changed to well-known manufacturers of API services can be, privacy protection is relatively good and free. (General computer local can run up the small model, the network has a free API)

 

optimal program

1. Local/cloud free doceker deployment Open WebUI + access to "small" model APIs

Local use only, computer hardware only needs to be able to run doceker

2. Self-deployment/use of tri-party NextChat + access to "small" model APIs

Self-deployment of NextChat requires your own domain name, and there is a risk of compromising your keys by using a three-party NextChat.

 

This deployment program is only for experienced people to operate, inexperienced white is not recommended, good use of mature products, or encounter abnormal problems delay is not worth it.

 

Optimal deployment option 1

 

1. Deployment of doceker

Local: local deployment of doceker tutorials search for yourself

Cloud: free doceker resources in the cloud, please search for yourself, here I use Koyeb.. (Intranet not directly accessible, requires science and technology)

 

2. Deploying Open WebUI in doceker

Local: DetailsRead the documentThe following installation commands are recommended (keep it up to date)

docker run --rm --volume /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once open-webui

 

Cloud: RegistrationKoyebAfter that, click Create Service and enter the following command

ghcr.io/open-webui/open-webui:main

低配电脑部署一套私人使用的

 

3. Start Open WebUI

Local startup, accessed by default at http://localhost:3000/

低配电脑部署一套私人使用的

 

Cloud startup, after Koyeb deployment is complete, can be clicked here (the disadvantage is that this domain name can not be directly accessed by the intranet, and binding the domain name requires the opening of a paid account)

低配电脑部署一套私人使用的

 

After startup, register an account, by default, the first registered account that is the administrator account. Already registered, so only the login interface, the first visit you can see the "registration" portal

低配电脑部署一套私人使用的

 

4. Apply for free "small" model APIs

OpenRouter is recommended, and has been writing novels for a year using its free models. Here's an explanation of how to get OpenRouter's modeling APIs

PS: Domestic free small model API vendors: Silicon Flow

 

4.1 Creating a KEY

低配电脑部署一套私人使用的

 

低配电脑部署一套私人使用的

You will get a string of characters starting with sk-, this is KEY, please copy it and save it locally, you can't copy it again after the page is closed.

 

4.2 Confirmation of the list of free models

低配电脑部署一套私人使用的

 

4.3 Getting the API request URL(math.) genus

Go to any model page to see it, generally: https://openrouter.ai/api/v1/chat/completions

低配电脑部署一套私人使用的

 

5. Enter the Open WebUI configuration model

Note that clicking on "4" confirms that the interface was accessed successfully before clicking on Save

低配电脑部署一套私人使用的

 

6. Configure the default model

Multiple free models can be selected

Use of paid models will result in account deactivation

低配电脑部署一套私人使用的

 

Click Presets to save frequently used models

低配电脑部署一套私人使用的

 

7. Try a first conversation

低配电脑部署一套私人使用的

 

 

Optimal deployment option 2

 

1.Cloud Deployment of NextChat

Free one-click cloud deployment, check out the help for yourself: https://github.com/ChatGPTNextWeb/ChatGPT-Next-Web

低配电脑部署一套私人使用的

 

2. The first Deploy (vercel) deployment is used here.

Just follow the process, and here are three things to keep in mind:

  • Be sure to read the help documentation carefully and follow the tutorial to set up your project to update automatically.
  • Configure the KEY variable and access password during the vercel installation process, it is recommended to configure them in advance.
  • Binding your own domain name allows direct access to domestic networks.

 

3. Configuration variables

It is not like Option 1 can automatically read the model list, you need to define your own free model list, note the change of interface address

BASE_URL or OpenAI Endpoint: Set this to https://openrouter.ai/api
OPENAI_API_KEY or OpenAI API Key: Enter your OpenRouter API key here.
CUSTOM_MODELS or Custom Models: Specify the model name as it is listed within OpenRouter.

 

4. Deployment completion screen

低配电脑部署一套私人使用的

 

5. Binding domain name

Resolving in-country access issues

 

低配电脑部署一套私人使用的

 

4. You can configure the API KEY for a model separately in the settings

低配电脑部署一套私人使用的

 

You can configure aOhMyGPTA small amount of free GPT4 credits per day, another address for stable access to API KEY (to prevent abusive hiding):

AI生产力应用This content has been hidden by the author, please enter the verification code to view the content
Captcha:
Please pay attention to this site WeChat public number, reply "CAPTCHA, a type of challenge-response test (computing)", get the verification code. Search in WeChat for "AI productivity applications"or"Artificial9527"or WeChat scanning the right side of the QR code can be concerned about this site WeChat public number.

Recommended

Can't find AI tools? Try here!

Just type in the keyword Accessibility Bing SearchYou can quickly find all the AI tools on this site.

Top

en_USEnglish