<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>blog.matterxiaomi.com</title>
  <id>http://blog.matterxiaomi.com/</id>
  <subtitle>This site is all about blog.matterxiaomi.com.</subtitle>
  <generator uri="https://github.com/madskristensen/Miniblog.Core" version="1.0">Miniblog.Core</generator>
  <updated>2026-05-03T19:13:00Z</updated>
  <entry>
    <id>http://blog.matterxiaomi.com/blog/ecovacs-part8/</id>
    <title>Ecovacs in Home Assistant Part8 - Create a complete Home Assistant integration for the Ecovacs X5 Pro (skills)</title>
    <updated>2026-05-12T19:59:00Z</updated>
    <published>2026-05-03T19:13:00Z</published>
    <link href="http://blog.matterxiaomi.com/blog/ecovacs-part8/" />
    <author>
      <name>test@example.com</name>
      <email>blog.matterxiaomi.com</email>
    </author>
    <category term="vacuum" />
    <content type="html">&lt;p&gt;Control Ecovacs Deebot robot vacuums via the Ecovacs Open Platform AK and a gateway (/robot/skill/*).&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;I'll analyze &lt;a href="https://github.com/mslycn/vacumm-ecovacs-deebot/blob/main/references/api.md#cloudctl-clean"&gt;https://github.com/mslycn/vacumm-ecovacs-deebot/blob/main/references/api.md#cloudctl-clean&lt;/a&gt; to understand the Ecovacs API. I've created a complete Home Assistant integration for the Ecovacs Deebot X5 Pro.&lt;/p&gt;
&lt;div class="mce-toc"&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#mcetoc_1joerp2gt8"&gt;Repository Structure&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1joerp2gt9"&gt;Control commands&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id="mcetoc_1joerp2gt8"&gt;Repository Structure&lt;/h2&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;custom_components/ecovacs_x5pro/
├── __init__.py (setup integration)
├── manifest.json (metadata)
├── const.py (constants)
├── config_flow.py (configuration UI)
├── api.py (API client)
└── vacuum.py (vacuum entity)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;api.py polls the device with the GetWorkState command.&lt;/p&gt;
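&lt;p&gt;api.py itself isn't shown in full here, so as a reference, a minimal sketch of the GetWorkState call, reusing the gateway endpoint and payload shape from Part 7 (the class and method names are placeholders, not necessarily the final api.py):&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;import aiohttp

GATEWAY = "https://open.ecovacs.cn/robot/skill/ctl"  # gateway endpoint from Part 7


class EcovacsAPI:
    """Minimal client sketch for the Ecovacs Open Platform gateway."""

    def __init__(self, ak, name):
        self.ak = ak      # Access Key (AK) from the Ecovacs Open Platform
        self.name = name  # device nickName, e.g. "DEEBOTX5PRO"

    async def get_work_state(self):
        """POST a GetWorkState control command and return the raw JSON."""
        payload = {
            "ak": self.ak,
            "nickName": self.name,
            "ctl": {"cmd": "GetWorkState"},
        }
        # A shared session would be better inside Home Assistant;
        # a per-call session keeps the sketch self-contained.
        async with aiohttp.ClientSession() as session:
            async with session.post(GATEWAY, json=payload) as resp:
                return await resp.json()&lt;/code&gt;&lt;/pre&gt;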
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;config_flow.py&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;img src="/Posts/files/integration-x5pro-skills-1_639134334924450239.jpg" alt="integration-x5pro-skills-1.jpg" width="577" height="329" /&gt;&lt;/p&gt;
&lt;p&gt;api.py&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;The actual API returns:&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;JSON response from api.py: {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success', 'data': {'ctl': {'data': {'ret': 'ok', 'cleanSt': 'h', 'chargeSt': 'charging', 'stationSt': 'dust'}}}}}
JSON response from api.py: {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success', 'data': {'ctl': {'data': {'ret': 'ok', 'cleanSt': 'wash', 'chargeSt': 'charging', 'stationSt': 'i'}}}}}
JSON response from api.py: {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success', 'data': {'ctl': {'data': {'ret': 'ok', 'cleanSt': 'washpause', 'chargeSt': 'charging', 'stationSt': 'i'}}}}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The API response has one extra level of nesting that wasn't accounted for, so you need to go one level deeper:&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;ctl = (
    data.get("data", {})      # First data level
        .get("data", {})      # &amp;larr; SECOND data level (was missing!)
        .get("ctl", {})       # Then ctl
        .get("data", {})      # Then final data
)&lt;/code&gt;&lt;/pre&gt;
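&lt;p&gt;To confirm the path, here is the parse applied to the first sample response above; it runs as-is:&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;# First sample response from api.py, copied from above
data = {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success',
        'data': {'ctl': {'data': {'ret': 'ok', 'cleanSt': 'h',
                                  'chargeSt': 'charging', 'stationSt': 'dust'}}}}}

ctl = (
    data.get("data", {})      # first data level
        .get("data", {})      # second data level
        .get("ctl", {})       # then ctl
        .get("data", {})      # then the final data
)

print(ctl["cleanSt"], ctl["chargeSt"], ctl["stationSt"])  # h charging dust&lt;/code&gt;&lt;/pre&gt;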
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;vacuum.py&lt;/p&gt;
&lt;p&gt;Recommended structure for the current vacuum.py (tree):&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;vacuum.py
│
├── imports
│
├── logger
│
├── SCAN_INTERVAL
│
├── state mapping
│   └── STATION_MAP
│
├── class EcovacsX5Vacuum(StateVacuumEntity)
│   │
│   ├── __init__
│   │
│   ├── device_info
│   │
│   ├── should_poll
│   │
│   ├── async_update
│   │
│   ├── state
│   │
│   ├── extra_state_attributes
│   │
│   ├── async_start
│   │
│   ├── async_pause
│   │
│   ├── async_return_to_base
│   │
│   └── _send
│
└── async_setup_entry&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;more detail&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;vacuum.py
│
├── from homeassistant.components.vacuum import ...
├── from datetime import timedelta
├── import aiohttp
├── import logging
│
├── _LOGGER
│
├── SCAN_INTERVAL
│
├── STATION_MAP
│   ├── i &amp;rarr; idle
│   ├── w &amp;rarr; washing
│   ├── d &amp;rarr; drying
│   ├── e &amp;rarr; emptying
│   ├── g &amp;rarr; going_charging
│   ├── c &amp;rarr; charging
│   └── p &amp;rarr; paused
│
├── class EcovacsX5Vacuum
│   │
│   ├── __init__(api)
│   │   ├── store api
│   │   ├── set the entity name
│   │   ├── set unique_id
│   │   ├── initialize state
│   │   └── set supported features
│   │
│   ├── device_info
│   │   └── register the device with HA
│   │
│   ├── should_poll
│   │   └── tell HA to poll this entity
│   │
│   ├── async_update
│   │   ├── call the API
│   │   ├── fetch the status
│   │   ├── update self._state
│   │   └── update self._station
│   │
│   ├── state
│   │   └── return the vacuum state
│   │
│   ├── extra_state_attributes
│   │   └── return station_state
│   │
│   ├── async_start
│   │   └── Clean s
│   │
│   ├── async_pause
│   │   └── Clean p
│   │
│   ├── async_return_to_base
│   │   └── Charge go
│   │
│   └── _send
│       ├── POST the control command
│       ├── refresh state
│       └── update the UI
│
└── async_setup_entry
    ├── create the API client
    └── register the entity&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;vacuum.py&lt;/p&gt;
&lt;p&gt;You must provide device_info (otherwise no device card appears):&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;@property
def device_info(self):
    return {
        "identifiers": {("ecovacs_x5pro", self.api.name)},
        "name": "Ecovacs X5 Pro",
        "manufacturer": "Ecovacs",
        "model": "X5 Pro",
    }&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;The entity must have a unique_id (otherwise it won't be registered):&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;self._attr_unique_id = f"ecovacs_x5pro_{api.name}"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;async_add_entities must be called for the entity to be created:&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code&gt;async_add_entities([EcovacsX5Vacuum(api)], update_before_add=True)&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;vacuum.py - async_setup_entry(hass, entry, async_add_entities)&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;# ---------------- setup ----------------

async def async_setup_entry(hass, entry, async_add_entities):
    """Set up vacuum entity from config entry."""


    async_add_entities(
        [EcovacsX5Vacuum(api)],
        update_before_add=True
    )&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;__init__.py -&amp;nbsp;async_setup_entry must forward setup to the right platforms:&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;await hass.config_entries.async_forward_entry_setups(
    entry,
    ["vacuum", "sensor"]   # 👈 你有哪些就写哪些
)&lt;/code&gt;&lt;/pre&gt;
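&lt;p&gt;For context, a minimal __init__.py around that call might look like the sketch below; storing the API client in hass.data is an assumption about how the platforms retrieve it:&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;from .api import EcovacsAPI

DOMAIN = "ecovacs_x5pro"
PLATFORMS = ["vacuum"]  # list whichever platforms you actually have


async def async_setup_entry(hass, entry):
    """Set up the integration from a config entry."""
    api = EcovacsAPI(entry.data["ak"], entry.data["nickname"])
    hass.data.setdefault(DOMAIN, {})[entry.entry_id] = api

    # Forward setup so vacuum.py's async_setup_entry runs
    await hass.config_entries.async_forward_entry_setups(entry, PLATFORMS)
    return True&lt;/code&gt;&lt;/pre&gt;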
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;vacuum.py&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;class EcovacsX5Vacuum(StateVacuumEntity):
    """Ecovacs X5 Pro vacuum entity."""

    def __init__(self, api):

        self._station = "idle"

    @property
    def device_info(self):
        """Return device information."""
        return {
            "identifiers": {("ecovacs_x5pro", self.api.name)},
            "name": "Ecovacs X5 Pro",
            "manufacturer": "Ecovacs",
            "model": "X5 Pro",
        }
    ....

    async def async_update(self):
        """Update the vacuum state."""
    ...
            # Update  State
            if clean_code == "wash":
                self._state = "washing"
    ...


# ---------------- setup ----------------

async def async_setup_entry(hass, entry, async_add_entities):
    """Set up vacuum entity from config entry."""

    from .api import EcovacsAPI

    api = EcovacsAPI(
        entry.data["ak"],
        entry.data["nickname"]
    )

   # 4.3 Create the entity
    async_add_entities(
        [EcovacsX5Vacuum(api)],
        update_before_add=True
    )&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;img src="/Posts/files/integration-x5pro-skills-2_639136861175022048.jpg" alt="integration-x5pro-skills-2.jpg" width="1066" height="348" /&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;   @property
   def icon(self):
       return "mdi:robot-vacuum"    # 加 icon，让 UI 更像官方
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1joerp2gt9"&gt;Control commands&lt;/h2&gt;
&lt;p&gt;vacuum.py&lt;/p&gt;
&lt;p&gt;This runs correctly:&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt; async def async_start(self):
        """Start the vacuum."""
        await self._send("Clean", {"act": "s"})

    async def async_pause(self):
        """Pause the vacuum."""
        await self._send("Clean", {"act": "p"})

    async def async_return_to_base(self):
        """Return the vacuum to base."""
        await self._send("Charge", {"act": "go"})

    async def _send(self, cmd, data):
        payload = {
            "ak": self.api.ak,
            "nickName": self.api.name,
            "ctl": {
                "cmd": cmd,
                "data": data
            }
        }&lt;/code&gt;&lt;/pre&gt;
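&lt;p&gt;The block above only builds the payload. Following the tree earlier (_send: POST the control command, refresh state, update the UI), a hedged completion could look like this, assuming the gateway endpoint from Part 7:&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;import aiohttp

GATEWAY = "https://open.ecovacs.cn/robot/skill/ctl"  # gateway endpoint from Part 7


class EcovacsX5Vacuum:  # sketch only; the real class extends StateVacuumEntity
    async def _send(self, cmd, data):
        """POST a control command, then refresh state and push it to the UI."""
        payload = {
            "ak": self.api.ak,
            "nickName": self.api.name,
            "ctl": {"cmd": cmd, "data": data},
        }
        async with aiohttp.ClientSession() as session:
            async with session.post(GATEWAY, json=payload) as resp:
                resp.raise_for_status()
        await self.async_update()       # re-query GetWorkState
        self.async_write_ha_state()     # push the new state to the UI&lt;/code&gt;&lt;/pre&gt;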
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>http://blog.matterxiaomi.com/blog/ecovacs-part7/</id>
    <title>Ecovacs in Home Assistant Part7 - Robot Vacuum Control skills</title>
    <updated>2026-05-11T15:45:15Z</updated>
    <published>2026-05-03T16:36:37Z</published>
    <link href="http://blog.matterxiaomi.com/blog/ecovacs-part7/" />
    <author>
      <name>test@example.com</name>
      <email>blog.matterxiaomi.com</email>
    </author>
    <category term="vacuum" />
    <content type="html">&lt;p&gt;For state queries, you can use the script status, or call GetWorkState via POST /robot/skill/ctl.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Now we call the Ecovacs Deebot gateway API via HTTP.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;div class="mce-toc"&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jnp4ngvmc"&gt;Device list&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jnp4ngvmd"&gt;Get area list&amp;nbsp;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jnp4ngvme"&gt;GetWorkState&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jnp52jp61"&gt;stationState when deebot docked&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jnp4ngvmc"&gt;Device list&lt;/h2&gt;
&lt;p&gt;curl -sS "https://open.ecovacs.com/robot/skill/deviceList?ak=your&amp;nbsp;Access Key"&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;curl -sS "https://open.ecovacs.cn/robot/skill/deviceList?ak=FOcadaKSxfWGsZ65bsDeGlXjc4bASLko" | python -m json.tool
{
    "msg": "OK",
    "code": 0,
    "data": [
        {
            "name": "E0**",
            "nick": "DEEBOTX5PRO"
        },
        {
            "name": "E0**",
            "nick": null
        }
    ]
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;curl -sS -X POST "https://open.ecovacs.cn/robot/skill/ctl" -H 'Content-Type: application/json' \&lt;/p&gt;
&lt;p&gt;&amp;nbsp;-d &amp;ldquo;{\"ak\":\"FOcadaKSxfWGsZ65bsDeGlXjc4bASLko\",\"nickName\":\"DEEBOTX5PRO\",\"ctl\":{\"cmd\":\"GetAreaList\",\"data\":{}}}"&amp;nbsp;&amp;nbsp;| python -m json.tool&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jnp4ngvmd"&gt;Get area list&amp;nbsp;&lt;/h2&gt;
&lt;p&gt;curl -sS -X POST "${BASE_URL}/robot/skill/ctl" -H 'Content-Type: application/json' \&lt;/p&gt;
&lt;p&gt;&amp;nbsp; -d "{\"ak\":\"${AK}\",\"nickName\":\"device nick or name fragment\",\"ctl\":{\"cmd\":\"GetAreaList\",\"data\":{}}}"&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;curl -sS -X POST "https://open.ecovacs.cn/robot/skill/ctl" \
  -H "Content-Type: application/json" \
  -d '{
    "ak": "FOcadaKSxfWGsZ65bsDeGlXjc4bASLko",
    "nickName": "DEEBOTX5PRO",
    "ctl": {
      "cmd": "GetAreaList",
      "data": {}
    }
  }' | python -m json.tool&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;output&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt; curl -sS -X POST "https://open.ecovacs.cn/robot/skill/ctl" \
  -H "Content-Type: application/json" \
  -d '{
    "ak": "FOcadaKSxfWGsZ65bsDeGlXjc4bASLko",
    "nickName": "DEEBOTX5PRO",
    "ctl": {
      "cmd": "GetAreaList",
      "data": {}
    }
  }' | python -m json.tool
{
    "msg": "OK",
    "code": 0,
    "data": {
        "code": 0,
        "msg": "success",
        "data": {
            "ctl": {
                "data": {
                    "ret": "ok",
                    "list": [
                        {
                            "subType": "6",
                            "mssid": "2",
                            "name": "\u5ba2\u5385\u536b\u751f\u95f4"
                        },
                        {
                            "subType": "6",
                            "mssid": "3",
                            "name": "\u4e3b\u5367\u536b\u751f\u95f4"
                        },
                        {
                            "subType": "13",
                            "mssid": "4",
                            "name": "\u9633\u53f0"
                        },
                        {
                            "subType": "5",
                            "mssid": "5",
                            "name": "\u53a8\u623f"
                        },
                        {
                            "subType": "0",
                            "mssid": "6",
                            "name": "\u5ba2\u4eba\u623f"
                        },
                        {
                            "subType": "10",
                            "mssid": "7",
                            "name": "\u513f\u7ae5\u623f"
                        },
                        {
                            "subType": "3",
                            "mssid": "8",
                            "name": "\u5367\u5ba4"
                        },
                        {
                            "subType": "1",
                            "mssid": "9",
                            "name": "\u5ba2\u5385"
                        },
                        {
                            "subType": "4",
                            "mssid": "10",
                            "name": "\u4e66\u623f"
                        }
                    ]
                }
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;subType: 0 unspecified, 1 living room, 2 dining room, 3 bedroom, 4 study, 5 kitchen, 6 bathroom.&lt;/p&gt;
&lt;p&gt;See: https://github.com/mslycn/vacumm-ecovacs-deebot/blob/main/references/api.md#getarealist-response-list-appendix-e&lt;/p&gt;
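&lt;p&gt;As a quick reference, the mapping above as a lookup table (codes 0-6 come from the text; the remaining codes are in the linked appendix):&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;# subType codes documented above; see the linked appendix for the full list.
SUBTYPE_MAP = {
    "0": "unspecified",
    "1": "living room",
    "2": "dining room",
    "3": "bedroom",
    "4": "study",
    "5": "kitchen",
    "6": "bathroom",
}


def area_label(area):
    """Return 'name (type)' for one GetAreaList entry."""
    return f'{area["name"]} ({SUBTYPE_MAP.get(area["subType"], "unknown")})'&lt;/code&gt;&lt;/pre&gt;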
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jnp4ngvme"&gt;GetWorkState&lt;/h2&gt;
&lt;p&gt;&lt;span style="color: #1f2328; font-family: Mona Sans VF, -apple-system, BlinkMacSystemFont, Segoe UI, Noto Sans, Helvetica, Arial, sans-serif, Apple Color Emoji, Segoe UI Emoji;"&gt;&lt;span style="font-size: 16px;"&gt;curl -X POST "https://open.ecovacs.cn/robot/skill/ctl" \&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="color: #1f2328; font-family: Mona Sans VF, -apple-system, BlinkMacSystemFont, Segoe UI, Noto Sans, Helvetica, Arial, sans-serif, Apple Color Emoji, Segoe UI Emoji;"&gt;&lt;span style="font-size: 16px;"&gt;-H "Content-Type: application/json" \&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="color: #1f2328; font-family: Mona Sans VF, -apple-system, BlinkMacSystemFont, Segoe UI, Noto Sans, Helvetica, Arial, sans-serif, Apple Color Emoji, Segoe UI Emoji;"&gt;&lt;span style="font-size: 16px;"&gt;-d '{&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="color: #1f2328; font-family: Mona Sans VF, -apple-system, BlinkMacSystemFont, Segoe UI, Noto Sans, Helvetica, Arial, sans-serif, Apple Color Emoji, Segoe UI Emoji;"&gt;&lt;span style="font-size: 16px;"&gt;&amp;nbsp; "ak": "你的AK",&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="color: #1f2328; font-family: Mona Sans VF, -apple-system, BlinkMacSystemFont, Segoe UI, Noto Sans, Helvetica, Arial, sans-serif, Apple Color Emoji, Segoe UI Emoji;"&gt;&lt;span style="font-size: 16px;"&gt;&amp;nbsp; "nickName": "你的设备",&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="color: #1f2328; font-family: Mona Sans VF, -apple-system, BlinkMacSystemFont, Segoe UI, Noto Sans, Helvetica, Arial, sans-serif, Apple Color Emoji, Segoe UI Emoji;"&gt;&lt;span style="font-size: 16px;"&gt;&amp;nbsp; "ctl": {&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="color: #1f2328; font-family: Mona Sans VF, -apple-system, BlinkMacSystemFont, Segoe UI, Noto Sans, Helvetica, Arial, sans-serif, Apple Color Emoji, Segoe UI Emoji;"&gt;&lt;span style="font-size: 16px;"&gt;&amp;nbsp; &amp;nbsp; "cmd": "GetWorkState"&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="color: #1f2328; font-family: Mona Sans VF, -apple-system, BlinkMacSystemFont, Segoe UI, Noto Sans, Helvetica, Arial, sans-serif, Apple Color Emoji, Segoe UI Emoji;"&gt;&lt;span style="font-size: 16px;"&gt;&amp;nbsp; }&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;span style="color: #1f2328; font-family: Mona Sans VF, -apple-system, BlinkMacSystemFont, Segoe UI, Noto Sans, Helvetica, Arial, sans-serif, Apple Color Emoji, Segoe UI Emoji;"&gt;&lt;span style="font-size: 16px;"&gt;}'&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;curl -X POST "https://open.ecovacs.cn/robot/skill/ctl" \
-H "Content-Type: application/json" \
-d '{
  "ak": "FOcadaKSxfWGsZ65bsDeGlXjc4bASLko",
  "nickName": "DEEBOTX5PRO",
  "ctl": {
    "cmd": "GetWorkState"
  }
}' | python -m json.tool&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;output&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;{
    "msg": "OK",
    "code": 0,
    "data": {
        "code": 0,
        "msg": "success",
        "data": {
            "ctl": {
                "data": {
                    "ret": "ok",
                    "cleanSt": "h",
                    "chargeSt": "charging",
                    "stationSt": "i"
                }
            }
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 id="mcetoc_1jnp52jp61"&gt;stationState when deebot docked&lt;/h3&gt;
&lt;p&gt;idle&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;"ret": "ok",
"cleanSt": "h",
"chargeSt": "charging",
"stationSt": "i"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;dust&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;"data": {
    "ret": "ok",
    "cleanSt": "h",
    "chargeSt": "charging",
    "stationSt": "dust"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Dust triggered via the Ecovacs Home app:&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;JSON response from api.py: {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success', 'data': {'ctl': {'data': {'ret': 'ok', 'cleanSt': 'h', 'chargeSt': 'charging', 'stationSt': 'i'}}}}}
JSON response from api.py: {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success', 'data': {'ctl': {'data': {'ret': 'ok', 'cleanSt': 'h', 'chargeSt': 'charging', 'stationSt': 'dustpause'}}}}}
JSON response from api.py: {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success', 'data': {'ctl': {'data': {'ret': 'ok', 'cleanSt': 'h', 'chargeSt': 'charging', 'stationSt': 'dust'}}}}}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Deebot auto-returns to dust:&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;Ecovacs data from vacuum.py: {'cleanSt': 'h', 'chargeSt': 'charging', 'stationSt': 'dust'}
Ecovacs data from vacuum.py: {'cleanSt': 's', 'chargeSt': 'i', 'stationSt': 'i'}
Ecovacs data from vacuum.py: {'cleanSt': 'h', 'chargeSt': 'g', 'stationSt': 'i'}
Ecovacs data from vacuum.py: {'cleanSt': 'p', 'chargeSt': 'charging', 'stationSt': 'i'}
Ecovacs data from vacuum.py: {'cleanSt': 'h', 'chargeSt': 'charging', 'stationSt': 'dustpause'}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;wash&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;"cleanSt": "wash",
"chargeSt": "charging",
"stationSt": "i"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;JSON response from api.py: {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success', 'data': {'ctl': {'data': {'ret': 'ok', 'cleanSt': 'h', 'chargeSt': 'charging', 'stationSt': 'dust'}}}}}
JSON response from api.py: {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success', 'data': {'ctl': {'data': {'ret': 'ok', 'cleanSt': 'wash', 'chargeSt': 'charging', 'stationSt': 'i'}}}}}
JSON response from api.py: {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success', 'data': {'ctl': {'data': {'ret': 'ok', 'cleanSt': 'washpause', 'chargeSt': 'charging', 'stationSt': 'i'}}}}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;dry&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;"data": {
    "ret": "ok",
    "cleanSt": "h",
    "chargeSt": "charging",
    "stationSt": "dry"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Cleaning (vacuum only)&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;{
    "msg": "OK",
    "code": 0,
    "data": {
        "code": 0,
        "msg": "success",
        "data": {
            "ctl": {
                "data": {
                    "ret": "ok",
                    "cleanSt": "s",
                    "chargeSt": "i",
                    "stationSt": "i"
                }
            }
        }
    }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Error code:&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;JSON response from api.py: {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success', 'data': {'ctl': {'data': {'ret': 'fail', 'errno': 4200, 'msg': '{"ret":"fail","errno":4200,"error":"endpoint offline","debug":"jmq.clusterNode.FetchClientInfo rsp!=null; clientinfo is in redis, but last endpoint ping and sync to redis time is 1778503965012, "}'}}}}}
JSON response from api.py: {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success', 'data': {'ctl': {'data': {'ret': 'fail', 'errno': 4200, 'msg': '{"ret":"fail","errno":4200,"error":"endpoint offline","debug":"jmq.clusterNode.FetchClientInfo rsp!=null; clientinfo is in redis, but last endpoint ping and sync to redis time is 1778503965013, "}'}}}}}
JSON response from api.py: {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success', 'data': {'ctl': {'data': {'ret': 'fail', 'errno': 4200, 'msg': '{"ret":"fail","errno":4200,"error":"endpoint offline","debug":"jmq.clusterNode.FetchClientInfo rsp!=null; clientinfo is in redis, but last endpoint ping and sync to redis time is 1778503965014, "}'}}}}}
JSON response from api.py: {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success', 'data': {'ctl': {'data': {'ret': 'fail', 'errno': 4200, 'msg': '{"ret":"fail","errno":4200,"error":"endpoint offline","debug":"jmq.clusterNode.FetchClientInfo rsp!=null; clientinfo is in redis, but last endpoint ping and sync to redis time is 1778503965018, "}'}}}}}
JSON response from api.py: {'msg': 'OK', 'code': 0, 'data': {'code': 0, 'msg': 'success', 'data': {'ctl': {'data': {'ret': 'ok', 'cleanSt': 'h', 'chargeSt': 'charging', 'stationSt': 'i'}}}}}&lt;/code&gt;&lt;/pre&gt;
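&lt;p&gt;Note that these failures come back with ret "fail" and errno 4200 inside an otherwise successful envelope (outer code 0, msg OK), so a client has to check the innermost ret rather than the outer codes. A sketch:&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;def parse_work_state(resp):
    """Extract the work state, raising on device-level failures like errno 4200."""
    ctl = (
        resp.get("data", {})
            .get("data", {})
            .get("ctl", {})
            .get("data", {})
    )
    if ctl.get("ret") != "ok":
        # e.g. {'ret': 'fail', 'errno': 4200, 'msg': '...endpoint offline...'}
        raise RuntimeError(f"device error {ctl.get('errno')}: {ctl.get('msg')}")
    return ctl  # {'cleanSt': ..., 'chargeSt': ..., 'stationSt': ...}&lt;/code&gt;&lt;/pre&gt;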
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Details:&lt;/p&gt;
&lt;p&gt;CloudCtl: Clean&amp;nbsp;https://github.com/mslycn/vacumm-ecovacs-deebot/blob/main/references/api.md#cloudctl-clean&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>http://blog.matterxiaomi.com/blog/raspberry-pi-part10/</id>
    <title>How to upgrade Raspberry Pi OS</title>
    <updated>2026-04-29T20:47:03Z</updated>
    <published>2026-04-29T18:26:05Z</published>
    <link href="http://blog.matterxiaomi.com/blog/raspberry-pi-part10/" />
    <author>
      <name>test@example.com</name>
      <email>blog.matterxiaomi.com</email>
    </author>
    <category term="raspberry pi os lite (64-bit)" />
    <category term="raspberry pi" />
    <content type="html">&lt;p&gt;This article is intended to look at the commands needed for the Raspberry Pi Upgradation to Latest Version.&lt;/p&gt;
&lt;p&gt;I have a RPi 5 running Raspberry Pi OS Lite (Bookworm), and I&amp;nbsp; upgrade it to the newest Raspberry OS (Debian 13 Trixie).&lt;/p&gt;
&lt;p&gt;Upgrade Raspberry Pi OS from Bookworm to Trixie.&lt;/p&gt;
&lt;div class="mce-toc"&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jndehs5h9"&gt;Check Raspberry OS Version (Current)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jndfb6ip1"&gt;Make Sure the System Is Up to Date&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jndfdofp1"&gt;Edit sources.list for Debian Trixie&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jndehs5ha"&gt;Update the Raspberry Pi OS&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jndehs5hb"&gt;Verification - Display your Debian version&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jndehs5h9"&gt;Check Raspberry OS Version (Current)&lt;/h2&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;cat /etc/os-release&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;output&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jndfb6ip1"&gt;Make Sure the System Is Up to Date&lt;/h2&gt;
&lt;p&gt;Before upgrading, make sure your current system is fully up to date: update the package index and perform a full upgrade of the existing installation.&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;sudo apt update
sudo apt full-upgrade
sudo reboot&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jndfdofp1"&gt;Edit sources.list for Debian Trixie&lt;/h2&gt;
&lt;p&gt;Change all references from bookworm to trixie.&lt;/p&gt;
&lt;p&gt;In this file, you only need to change bookworm&amp;nbsp;to&amp;nbsp;trixie.&lt;/p&gt;
&lt;p&gt;sudo nano /etc/apt/sources.list&lt;/p&gt;
&lt;p&gt;change&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;deb http://raspbian.raspberrypi.org/raspbian/ bookworm main contrib non-free rpi non-free-firmware&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;to&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;deb http://deb.debian.org/debian trixie main contrib non-free non-free-firmware
deb http://deb.debian.org/debian-security/ trixie-security main contrib non-free non-free-firmware
deb http://deb.debian.org/debian trixie-updates main contrib non-free non-free-firmware&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;sudo nano /etc/apt/sources.list.d/raspi.list&lt;/p&gt;
&lt;p&gt;change&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;deb http://archive.raspberrypi.com/debian/ bookworm main&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;to&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;deb http://archive.raspberrypi.com/debian/ trixie main&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jndehs5ha"&gt;Update the Raspberry Pi OS&lt;/h2&gt;
&lt;p&gt;Upgrade to Debian 13 (Trixie)&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;# Ensure that your system has the latest repository information.Refresh package index
sudo apt update -y
# This will update the Raspberry Pi to the version that is available for the device.
sudo apt full-upgrade -y
# Restart your Device
sudo reboot&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;output&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code&gt;Get:1 http://deb.debian.org/debian trixie InRelease [140 kB]
Get:2 http://deb.debian.org/debian-security trixie-security InRelease [43.4 kB]
...
Get:858 http://deb.debian.org/debian trixie/main arm64 zstd arm64 1.5.7+dfsg-1 [635 kB]                                                
Fetched 788 MB in 1min 17s (10.2 MB/s)                                                                                                 
Reading changelogs... Done

...

update-initramfs: Generating /boot/initrd.img-6.12.75+rpt-rpi-v8
'/boot/initrd.img-6.12.75+rpt-rpi-v8' -&amp;gt; '/boot/firmware/initramfs8'
update-initramfs: Generating /boot/initrd.img-6.12.75+rpt-rpi-2712
'/boot/initrd.img-6.12.75+rpt-rpi-2712' -&amp;gt; '/boot/firmware/initramfs_2712'
Processing triggers for libgdk-pixbuf-2.0-0:arm64 (2.42.12+dfsg-4+deb13u1) ...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After the upgrade completes, reboot your system.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jndehs5hb"&gt;Verification - Display your Debian version&lt;/h2&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt; cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 13 (trixie)"
NAME="Debian GNU/Linux"
VERSION_ID="13"
VERSION="13 (trixie)"
VERSION_CODENAME=trixie
DEBIAN_VERSION_FULL=13.4
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Useful links&lt;/p&gt;
&lt;p&gt;Upgrades from Debian 12 (bookworm)&lt;/p&gt;
&lt;p&gt;https://www.debian.org/releases/trixie/release-notes/upgrading.en.html#upgrading-full&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>http://blog.matterxiaomi.com/blog/run-local-LLM-server-part5/</id>
    <title>How to run llama.cpp locally for Home Assistant on RPi 5</title>
    <updated>2026-05-09T17:05:26Z</updated>
    <published>2026-04-21T21:33:31Z</published>
    <link href="http://blog.matterxiaomi.com/blog/run-local-LLM-server-part5/" />
    <author>
      <name>test@example.com</name>
      <email>blog.matterxiaomi.com</email>
    </author>
    <category term="ai" />
    <category term="llm" />
    <content type="html">&lt;p&gt;Large Language Models for Home Assistant.&lt;/p&gt;
&lt;p&gt;To run llama.cpp locally for Home Assistant, you must host a llama.cpp server that exposes an API Home Assistant can communicate with.&lt;/p&gt;
&lt;p&gt;Home Assistant does not have a "llama.cpp" brand integration by default.&lt;/p&gt;
&lt;p&gt;Connect Home Assistant to it using a compatible integration, such as&amp;nbsp;https://github.com/skye-harris/hass_local_openai_llm.&lt;/p&gt;
&lt;div class="mce-toc"&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmtliqsh1"&gt;Run llama-server&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmtllr513"&gt;Connect to Home Assistant&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmro8d8sf"&gt;Integration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmro8d8sg"&gt;Voice assistant&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Device: Raspberry Pi 5 (8GB)&lt;/p&gt;
&lt;p&gt;OS: Debian 12&lt;/p&gt;
&lt;p&gt;Runtime: Docker&lt;/p&gt;
&lt;p&gt;Inference Engine: &lt;span style="white-space: normal;"&gt;llama-server&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Model: Gemma 4 E2B (GGUF, quantized)&lt;/p&gt;
&lt;p&gt;Home Assistant Integration:&amp;nbsp;Local OpenAI LLM&lt;/p&gt;
&lt;h1 style="font-family: Roboto, Noto, sans-serif; -webkit-font-smoothing: antialiased; font-size: 32px; line-height: 40px; text-underline-position: from-font; text-decoration-skip-ink: none; margin: 0px; color: #141414; background-color: #fafafa;"&gt;&amp;nbsp;&lt;/h1&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;In the ghcr.io/ggml-org/llama.cpp repository, the images are split by purpose:&lt;/p&gt;
&lt;table data-path-to-node="3"&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;span data-path-to-node="3,0,0,0"&gt;Tag&lt;/span&gt;&lt;/th&gt;
&lt;th&gt;&lt;span data-path-to-node="3,0,1,0"&gt;Primary Contents&lt;/span&gt;&lt;/th&gt;
&lt;th&gt;&lt;span data-path-to-node="3,0,2,0"&gt;Best Use Case&lt;/span&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;span data-path-to-node="3,1,0,0"&gt;&lt;strong data-path-to-node="3,1,0,0" data-index-in-node="0"&gt;&lt;code data-path-to-node="3,1,0,0" data-index-in-node="0"&gt;:light&lt;/code&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;&lt;span data-path-to-node="3,1,1,0"&gt;&lt;code data-path-to-node="3,1,1,0" data-index-in-node="0"&gt;llama-cli&lt;/code&gt;, &lt;code data-path-to-node="3,1,1,0" data-index-in-node="11"&gt;llama-completion&lt;/code&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;&lt;span data-path-to-node="3,1,2,0"&gt;&lt;strong data-path-to-node="3,1,2,0" data-index-in-node="0"&gt;Testing/CLI:&lt;/strong&gt; Best for running models in the terminal or one-off completions without overhead.&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;span data-path-to-node="3,2,0,0"&gt;&lt;strong data-path-to-node="3,2,0,0" data-index-in-node="0"&gt;&lt;code data-path-to-node="3,2,0,0" data-index-in-node="0"&gt;:server&lt;/code&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;&lt;span data-path-to-node="3,2,1,0"&gt;&lt;code data-path-to-node="3,2,1,0" data-index-in-node="0"&gt;llama-server&lt;/code&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;&lt;span data-path-to-node="3,2,2,0"&gt;&lt;strong data-path-to-node="3,2,2,0" data-index-in-node="0"&gt;Production/API:&lt;/strong&gt; Ideal for your Home Assistant setup. It provides the OpenAI-compatible endpoint.&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;span data-path-to-node="3,3,0,0"&gt;&lt;strong data-path-to-node="3,3,0,0" data-index-in-node="0"&gt;&lt;code data-path-to-node="3,3,0,0" data-index-in-node="0"&gt;:full&lt;/code&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;&lt;span data-path-to-node="3,3,1,0"&gt;CLI, Server, &lt;strong data-path-to-node="3,3,1,0" data-index-in-node="13"&gt;and&lt;/strong&gt; Python conversion/quantization tools.&lt;/span&gt;&lt;/td&gt;
&lt;td&gt;&lt;span data-path-to-node="3,3,2,0"&gt;&lt;strong data-path-to-node="3,3,2,0" data-index-in-node="0"&gt;Development:&lt;/strong&gt; Use this if you need to convert &lt;code data-path-to-node="3,3,2,0" data-index-in-node="45"&gt;.safetensors&lt;/code&gt; to &lt;code data-path-to-node="3,3,2,0" data-index-in-node="61"&gt;.gguf&lt;/code&gt; or quantize a model yourself.&lt;/span&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;:light: contains only llama-cli and llama-completion. It does not contain the API server.&lt;/p&gt;
&lt;p&gt;:server: contains only llama-server, a lightweight, OpenAI-API-compatible HTTP server for serving LLMs; it does not include llama-cli.&lt;/p&gt;
&lt;p&gt;:full: contains everything.&lt;/p&gt;
&lt;p&gt;You should use the :server tag (or better yet, the :server-arm64 tag since you are on a Raspberry Pi 5).&lt;/p&gt;
&lt;h2 id="mcetoc_1jmtliqsh1"&gt;Run llama-server&lt;/h2&gt;
&lt;p&gt;The llama-server executable acts as an OpenAI-compatible API that Home Assistant can use.&amp;nbsp;&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;docker run -it --rm \
  --name llama \
  -v /datadocker/llama-cpp/models:/models \
  -p 8091:8080 \
  ghcr.io/ggml-org/llama.cpp:server \
  -m /models/google_gemma-4-E2B-it-Q4_0.gguf \
  --host 0.0.0.0 \
  --port 8080 \
  --threads 4 \
  --jinja
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Parameter Description&lt;/p&gt;
&lt;p&gt;--entrypoint /app/llama-cli: processes your input (or waits for one) and then exits. It does not listen for network requests on a port.&lt;/p&gt;
&lt;p&gt;--entrypoint /app/llama-server: llama-cli is for one-off prompts in the terminal; llama-server is required to handle API calls such as curl.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;--host 0.0.0.0: Inside a Docker container, the server must listen on 0.0.0.0 to accept connections from your Raspberry Pi's IP or localhost.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;--port 8080: This tells the software inside the container to listen on port 8080 (which you mapped to 8091 on your host).&lt;/p&gt;
&lt;p&gt;--jinja: enables support for OpenAI-style function calling. Tool calling must be enabled in the inference engine. &lt;a href="https://github.com/ggml-org/llama.cpp/blob/master/docs/function-calling.md"&gt;Detail&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;output&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;...
srv          init: init: chat template, thinking = 1
main: model loaded
main: server is listening on http://0.0.0.0:8080
main: starting the main loop...
srv  update_slots: all slots are idle
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now llama.cpp = local LLM &amp;rarr; HTTP API server&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Test&lt;/p&gt;
&lt;p&gt;Once the server logs show "HTTP server listening", run your curl command. Make sure to include a JSON body, otherwise the server might reject the request:&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;curl http://localhost:8091/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello Gemma!"}]
  }'&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;&amp;nbsp;&lt;/h2&gt;
&lt;h2 id="mcetoc_1jmtllr513"&gt;Connect to Home Assistant&lt;/h2&gt;
&lt;h3 id="mcetoc_1jmro8d8sf"&gt;Integration -&amp;nbsp;Add Integration&lt;/h3&gt;
&lt;p&gt;Custom Integration - Local OpenAI LLM Integration&lt;/p&gt;
&lt;p&gt;https://github.com/skye-harris/hass_local_openai_llm&lt;/p&gt;
&lt;p&gt;Wyoming-LLM (The Bridge): A Home Assistant Integration that sits between HA and llama.cpp.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Custom Integration - Configure Integration&lt;/p&gt;
&lt;p&gt;Add the server URL to the initial server configuration:&lt;/p&gt;
&lt;p&gt;http://192.168.2.125:8091&lt;/p&gt;
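&lt;p&gt;Before adding the integration, you can sanity-check the endpoint from Python. This sketch assumes the openai package is installed; llama-server serves whatever model it loaded, so the model name here is just a placeholder:&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;from openai import OpenAI

# llama-server exposes an OpenAI-compatible API under /v1; no real key is needed
client = OpenAI(base_url="http://192.168.2.125:8091/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="gemma",  # placeholder; the server uses the model it was started with
    messages=[{"role": "user", "content": "Hello Gemma!"}],
)
print(resp.choices[0].message.content)&lt;/code&gt;&lt;/pre&gt;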
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 id="mcetoc_1jmro8d8sg"&gt;Voice assistant&amp;nbsp;&lt;span style="font-size: 14px;"&gt;- Create&amp;nbsp; conversation agent&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;Add assistant&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>http://blog.matterxiaomi.com/blog/run-local-LLM-server-part3/</id>
    <title>How to Run AI Models Locally with llama.cpp on rpi5</title>
    <updated>2026-04-23T17:42:13Z</updated>
    <published>2026-04-18T22:52:25Z</published>
    <link href="http://blog.matterxiaomi.com/blog/run-local-LLM-server-part3/" />
    <author>
      <name>test@example.com</name>
      <email>blog.matterxiaomi.com</email>
    </author>
    <category term="ai" />
    <category term="llm" />
    <content type="html">&lt;p&gt;Most people access generative AI tools like ChatGPT or Gemini through a web interface or API &amp;mdash; but what if you could run them locally?&lt;/p&gt;
&lt;p&gt;In this article, you&amp;rsquo;ll learn how to set up your own local generative AI using existing models such as llama.cpp.&lt;/p&gt;
&lt;p&gt;The final result will look like the GIF shown below (note: it&amp;rsquo;s hosted on localhost).&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;div class="mce-toc"&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmhd01hc1"&gt;Prerequisites&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmoqdr6vc"&gt;step 1. Pull llama.cpp (light)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmoqdr6vd"&gt;step 2.&amp;nbsp;Pick a model - Download Gemma (GGUF, quantized)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmoqdr6ve"&gt;step 3.&amp;nbsp; Docker run and load model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmoqfbh0g"&gt;step 4.&amp;nbsp;test&amp;nbsp;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id="mcetoc_1jmhd01hc1"&gt;Prerequisites&lt;/h2&gt;
&lt;p&gt;Hardware: Raspberry Pi 5 (8GB RAM highly recommended).&lt;/p&gt;
&lt;p&gt;OS: Raspberry Pi OS (64-bit) or Ubuntu (64-bit).&lt;/p&gt;
&lt;p&gt;Storage: At least 5GB free space (preferably on an SSD/NVMe for speed).&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Device: Raspberry Pi 5 (8GB)&lt;/p&gt;
&lt;p&gt;OS: Debian 12&lt;/p&gt;
&lt;p&gt;Runtime: Docker&lt;/p&gt;
&lt;p&gt;Engine: llama.cpp&lt;/p&gt;
&lt;p&gt;Model: Gemma 4 E2B (GGUF, quantized)&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Install Docker (Debian 12)&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Here are the commands I used to get Gemma 4 E2B running on a Raspberry Pi 5 (8 GB):&lt;/p&gt;
&lt;h3 id="mcetoc_1jmoqdr6vc"&gt;step 1. Pull llama.cpp (light)&lt;/h3&gt;
&lt;p&gt;First of all, we need an LLM Serving Engine, such as llama.cpp.&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code&gt;# This is the correct lightweight image for Pi (ARM)
docker pull ghcr.io/ggml-org/llama.cpp:light&lt;/code&gt;&lt;/pre&gt;
&lt;h3 id="mcetoc_1jmoqdr6vd"&gt;step 2.&amp;nbsp;Pick a model - Download Gemma (GGUF, quantized)&lt;/h3&gt;
&lt;pre class="language-markup"&gt;&lt;code&gt;# llama.cpp only works with GGUF

# Create model directory
mkdir -p /datadocker/llama-cpp/models
cd /datadocker/llama-cpp/models


https://huggingface.co/unsloth/gemma-4-E2B-it-GGUF/tree/main

https://huggingface.co/bartowski/google_gemma-4-E2B-it-GGUF/tree/main

google_gemma-4-E2B-it-Q4_0.gguf
https://huggingface.co/bartowski/google_gemma-4-E2B-it-GGUF/resolve/main/google_gemma-4-E2B-it-Q4_0.gguf?download=true&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note&lt;/p&gt;
&lt;p&gt;There is no official &amp;ldquo;Gemma 4 E2B GGUF direct URL&amp;rdquo; from Google.&lt;/p&gt;
&lt;p&gt;GGUF files are community-converted and hosted on Hugging Face.&lt;/p&gt;
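&lt;p&gt;If you prefer scripting the download, a minimal sketch using only the Python standard library (the URL is the resolve link listed above; the file is several GB, so make sure there is enough free space):&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;import urllib.request

# Community-converted GGUF on Hugging Face (resolve URL from the block above)
URL = ("https://huggingface.co/bartowski/google_gemma-4-E2B-it-GGUF"
       "/resolve/main/google_gemma-4-E2B-it-Q4_0.gguf?download=true")
DEST = "/datadocker/llama-cpp/models/google_gemma-4-E2B-it-Q4_0.gguf"

urllib.request.urlretrieve(URL, DEST)
print("saved to", DEST)&lt;/code&gt;&lt;/pre&gt;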
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 id="mcetoc_1jmoqdr6ve"&gt;step 3.&amp;nbsp; Docker run and load model&lt;/h3&gt;
&lt;p&gt;Run llama.cpp server (Docker)&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code&gt;docker run -v /path/to/models:/models --entrypoint /app/llama-cli ghcr.io/ggml-org/llama.cpp:light -m /models/7B/ggml-model-q4_0.gguf&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This runs correctly:&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;docker run -it --rm \
  -v /datadocker/llama-cpp/models:/models \
  --entrypoint /app/llama-cli \
  -p 8091:8080 \
  ghcr.io/ggml-org/llama.cpp:light \
  -m /models/google_gemma-4-E2B-it-Q4_0.gguf&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note&lt;/p&gt;
&lt;p&gt;1. Pick the model you downloaded earlier.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;output&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code&gt;load_backend: loaded CPU backend from /app/libggml-cpu-armv8.2_2.so

Loading model...

▄▄ ▄▄
██ ██
██ ██  ▀▀█▄ ███▄███▄  ▀▀█▄    ▄████ ████▄ ████▄
██ ██ ▄█▀██ ██ ██ ██ ▄█▀██    ██    ██ ██ ██ ██
██ ██ ▀█▄██ ██ ██ ██ ▀█▄██ ██ ▀████ ████▀ ████▀
                                    ██    ██
                                    ▀▀    ▀▀

build      : b8833-45cac7ca7
model      : google_gemma-4-E2B-it-Q4_0.gguf
modalities : text

available commands:
  /exit or Ctrl+C     stop or exit
  /regen              regenerate the last response
  /clear              clear the chat history
  /read &amp;lt;file&amp;gt;        add a text file
  /glob &amp;lt;pattern&amp;gt;     add text files using globbing pattern


&amp;gt; 
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 id="mcetoc_1jmoqfbh0g"&gt;step 4.&amp;nbsp;test&amp;nbsp;&lt;/h3&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;&amp;gt; how

[Start thinking]
Thinking Process:

1.  **Analyze the Request:** The user provided a single word: "how". This is an extremely open-ended prompt. It implies a question, a request for a process, or a search for a method.
2.  **Determine the Context:** Since there is no context, I must provide a response that addresses the ambiguity while offering ways to get a useful answer.
3.  **Identify the Goal:** The goal is to prompt the user to specify what they want to know "how" to do.
4.  **Brainstorm Potential Interpretations of "how":**
    *   *How does X work?* (Mechanism, process)
    *   *How do I do Y?* (Instructions, tutorial)
    *   *How to achieve Z?* (Strategy, steps)
    *   *How are we connected?* (Relationship, system)
5.  **Formulate the Response Strategy:**
    *   Acknowledge the brevity.
    *   State that more information is needed.
    *   Provide examples of the types of questions I can answer.
    *   Invite the user to ask the specific question.
6.  **Draft the Response (Self-Correction/Refinement):** Start with a friendly, open invitation. Ensure the tone is helpful and encouraging. (The resulting response should be a clear call to action.)
[End thinking]

Please tell me what you would like to know **how** to do! 😊

I can help you with instructions, explanations, processes, recipes, coding, concepts, and much more.

**For example, you could ask:**

* "How do I bake a cake?"
* "How does photosynthesis work?"
* "How do I change the font in Microsoft Word?"
* "How do I start learning Spanish?"

**Just tell me your question!**

[ Prompt: 8.6 t/s | Generation: 5.6 t/s ]

&amp;gt; 
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;You can now access generative AI tools like llama.cpp through a web interface or API.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Useful links&lt;/p&gt;
&lt;p&gt;llama.cpp on GitHub with Docker&amp;nbsp;image&lt;/p&gt;
&lt;p&gt;https://github.com/ggml-org/llama.cpp/blob/master/docs/docker.md&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Models download via&amp;nbsp; url&lt;/p&gt;
&lt;p&gt;https://huggingface.co/bartowski/google_gemma-4-E2B-it-GGUF/tree/main&lt;/p&gt;
&lt;p&gt;https://huggingface.co/bartowski/google_gemma-4-E2B-it-GGUF/blob/main/google_gemma-4-E2B-it-Q4_0.gguf&lt;/p&gt;
&lt;p&gt;https://huggingface.co/ggml-org/gemma-4-E2B-it-GGUF&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>http://blog.matterxiaomi.com/blog/run-local-LLM-server-part2/</id>
    <title>How to Run AI Models Locally with Ollama</title>
    <updated>2026-04-21T20:42:51Z</updated>
    <published>2026-04-17T19:11:15Z</published>
    <link href="http://blog.matterxiaomi.com/blog/run-local-LLM-server-part2/" />
    <author>
      <name>test@example.com</name>
      <email>blog.matterxiaomi.com</email>
    </author>
    <category term="ai" />
    <category term="llm" />
    <content type="html">&lt;p&gt;In this article, you&amp;rsquo;ll learn how to set up your own local generative AI using existing models such as Gemma 4 and Meta&amp;rsquo;s LLaMA 3.&lt;/p&gt;
&lt;p&gt;The official Ollama Docker image &lt;a href="https://hub.docker.com/r/ollama/ollama"&gt;ollama/ollama&lt;/a&gt; is available on Docker Hub.&lt;/p&gt;
&lt;p&gt;To run the Gemma 4 model locally using Ollama:&lt;/p&gt;
&lt;p&gt;First of all, we need an LLM Serving Engine, such as Ollama: A framework for running large language models locally.&lt;/p&gt;
&lt;p&gt;Pull the LLM models via&amp;nbsp;Ollama&lt;/p&gt;
&lt;p&gt;Load the LLM models via&amp;nbsp;Ollama&lt;/p&gt;
&lt;p&gt;Test&amp;nbsp; the LLM models via&amp;nbsp;Ollama CLI&lt;/p&gt;
&lt;div class="mce-toc"&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmgfi3k41"&gt;Install Ollama with Docker&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmgfjbqc3"&gt;Pull and Run a Model via ollama&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmghk0ah1"&gt;Test&amp;nbsp; the LLM models via Ollama CLI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmgltup51"&gt;Configuration Checklist&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id="mcetoc_1jmgfi3k41"&gt;Install Ollama with Docker&lt;/h2&gt;
&lt;p&gt;There are several ways to install it on your machine; we will run Ollama via Docker.&lt;/p&gt;
&lt;p&gt;Download&lt;/p&gt;
&lt;p&gt;docker pull ollama/ollama:0.21.0&lt;/p&gt;
&lt;p&gt;&lt;img src="/Posts/files/ollama-1_639120498754095416.jpg" alt="ollama-1.jpg" width="789" height="342" /&gt;&lt;/p&gt;
&lt;p&gt;Version control:&lt;/p&gt;
&lt;p&gt;Google officially released the Gemma 4 family on April 2, 2026, and the latest Ollama version has stabilized support for its architecture.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2&gt;Run the Container&lt;/h2&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;You need to decide whether you want CPU-only or GPU acceleration, and specify a version.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;CPU Only&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;docker run -d \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama:0.21.0&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The container starts an API server, but it doesn't come with any LLMs pre-installed.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jmgfjbqc3"&gt;Pull and Run a Model via ollama&lt;/h2&gt;
&lt;p&gt;You need to "exec" into the container to pull a model (like Llama 4 ).&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;# into the container
docker exec -it ollama sh

# Download a model without running it
ollama pull [model-name]

# Run a Model
# A single command (ollama run gemma4:e4b) handles downloading, memory management, and API serving.
ollama run gemma4:e4b&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Ollama will automatically download the model; the server then loads it.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Once the model is loaded, you can see generating output:&lt;/p&gt;
&lt;p&gt;output&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code&gt;# ollama run gemma4:e2b
&amp;gt;&amp;gt;&amp;gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;docker logs -f ollama&lt;/p&gt;
&lt;p&gt;output&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code&gt;time=2026-04-18T09:18:54.786Z level=INFO source=server.go:1398 msg="waiting for server to become available" status="llm server loading model"
...
time=2026-04-18T09:20:29.899Z level=INFO source=server.go:1402 msg="llama runner started in 97.04 seconds"
[GIN] 2026/04/18 - 09:20:32 | 200 |         1m42s |       127.0.0.1 | POST     "/api/generate"
[GIN] 2026/04/18 - 09:22:32 | 200 | 39.184533484s |       127.0.0.1 | POST     "/api/chat"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jmghk0ah1"&gt;Test&amp;nbsp; the LLM models via Ollama CLI&lt;/h2&gt;
&lt;p&gt;Start a chat to test the LLM.&lt;/p&gt;
&lt;p&gt;Verify - CLI (test in the CLI)&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code&gt;# ollama run gemma4:e2b
&amp;gt;&amp;gt;&amp;gt; how
Thinking...
Thinking Process:

1.  **Analyze the Input:** The input is "how".
...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Verify - api test&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code&gt;curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [{
    "role": "user",
    "content": "Why is the sky blue?"
  }],
  "stream": false
}'
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;output&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code&gt;root@debian:~# curl http://localhost:11434/api/chat -d '{
  "model": "gemma4:e2b",
  "messages": [{
    "role": "user",
    "content": "Why is the sky blue?"
  }],
  "stream": false
}'
{"model":"gemma4:e2b","created_at":"2026-04-18T09:24:45.242321396Z","message":{"role":"assistant","content":"The reason the sky appears blue is due to a phenomenon called **Rayleigh Scattering**. It is a result of how sunlight interacts with the small molecules of the Earth's atmosphere.\n\nHere is a detailed breakdown of the process:\n\n---\n\n### 1. The Ingredients: Sunlight and Atmosphere\n\n**A. Sunlight is White Light:**\nSunlight, which appears white to us, is actu
...&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Verify - Web UI(Browser test)&lt;/p&gt;
&lt;p&gt;open-webui&lt;/p&gt;
&lt;p&gt;https://github.com/open-webui/open-webui&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;# ollama list
NAME             ID              SIZE      MODIFIED          
gemma4:e2b       7fbdbf8f5e45    7.2 GB    15 minutes ago       
gemma4:e4b       c6eb396dbd59    9.6 GB    About an hour ago    
gemma3:4b        a2af6cc3eb7f    3.3 GB    2 hours ago          
llama3:latest    365c0bd3c000    4.7 GB    3 hours ago &lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Quick Diagnostic Steps&lt;/p&gt;
&lt;p&gt;docker logs -f ollama&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jmgltup51"&gt;Configuration Checklist&lt;/h2&gt;
&lt;p&gt;OS: 64-bit Debian 12&lt;/p&gt;
&lt;p&gt;RAM: 16 GB&lt;/p&gt;
&lt;p&gt;CPU: Intel Core i5-10400&lt;/p&gt;
&lt;p&gt;Largest model run: Gemma 4 E2B (4-bit)&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Useful links&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Recommended Ollama models&lt;/p&gt;
&lt;p&gt;https://ollama.com/library&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Gemma 4 model information&lt;/p&gt;
&lt;p&gt;https://ollama.com/library/gemma4:e4b&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Gemma 4 Inference Memory Requirements&lt;/p&gt;
&lt;p&gt;https://ai.google.dev/gemma/docs/core#gemma-4-inference-memory-requirements&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameters&lt;/th&gt;
&lt;th&gt;BF16 (16-bit)&lt;/th&gt;
&lt;th&gt;SFP8 (8-bit)&lt;/th&gt;
&lt;th&gt;Q4_0 (4-bit)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 E2B&lt;/td&gt;
&lt;td&gt;9.6 GB&lt;/td&gt;
&lt;td&gt;4.6 GB&lt;/td&gt;
&lt;td&gt;3.2 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 E4B&lt;/td&gt;
&lt;td&gt;15 GB&lt;/td&gt;
&lt;td&gt;7.5 GB&lt;/td&gt;
&lt;td&gt;5 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 31B&lt;/td&gt;
&lt;td&gt;58.3 GB&lt;/td&gt;
&lt;td&gt;30.4 GB&lt;/td&gt;
&lt;td&gt;17.4 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemma 4 26B A4B&lt;/td&gt;
&lt;td&gt;48 GB&lt;/td&gt;
&lt;td&gt;25 GB&lt;/td&gt;
&lt;td&gt;15.6 GB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;If a model does not fit, Ollama fails with an error like: "model requires more system memory (9.8 GiB) than is available (4.7 GiB)".&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Related blog post&lt;/p&gt;
&lt;p&gt;https://medium.com/tech-ai-chat/running-llm-on-a-local-mac-machine-0dae23d8320b&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>http://blog.matterxiaomi.com/blog/run-local-LLM-server-part1/</id>
    <title>Ollama vs LiteLLM vs llama.cpp vs vLLM vs LM Studio</title>
    <updated>2026-04-19T18:27:39Z</updated>
    <published>2026-04-05T03:15:00Z</published>
    <link href="http://blog.matterxiaomi.com/blog/run-local-LLM-server-part1/" />
    <author>
      <name>test@example.com</name>
      <email>blog.matterxiaomi.com</email>
    </author>
    <category term="ai" />
    <category term="llm" />
    <content type="html">&lt;p&gt;How to run a local LLM server step by step&lt;/p&gt;
&lt;p&gt;Ollama vs LiteLLM vs llama.cpp vs vLLM vs LM Studio&lt;/p&gt;
&lt;p&gt;These tools represent different layers of the AI stack. While they overlap, they generally serve distinct purposes:&lt;/p&gt;
&lt;p&gt;Serving (Llama.cpp, vLLM),&lt;/p&gt;
&lt;p&gt;Managing (Ollama, LM Studio),&lt;/p&gt;
&lt;p&gt;Routing (LiteLLM).&lt;/p&gt;
&lt;div class="mce-toc"&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmgik5tof"&gt;Managing (Ollama, LM Studio)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmgik5tog"&gt;Serving (Llama.cpp, vLLM)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jmgik5toh"&gt;Routing (LiteLLM)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jmgik5tof"&gt;Managing (Ollama, LM Studio)&lt;/h2&gt;
&lt;p&gt;Ollama&lt;/p&gt;
&lt;p&gt;A local LLM inference/runtime platform. It handles model downloads, storage, and execution with a simple CLI/API. Think of it as a &amp;ldquo;local LLM server&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;Run AI models locally and integrate via its API.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;LM Studio&lt;/p&gt;
&lt;p&gt;A desktop application.&lt;/p&gt;
&lt;p&gt;Run AI models locally with a chat UI.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jmgik5tog"&gt;Serving (Llama.cpp, vLLM)&lt;/h2&gt;
&lt;p&gt;llama.cpp runs AI models on edge devices.&lt;/p&gt;
&lt;p&gt;For example: run a model on a Raspberry Pi.&lt;/p&gt;
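&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;For a feel of the API, a minimal sketch using the llama-cpp-python bindings (an illustrative choice, not something this post sets up; the GGUF path is a placeholder):&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;# Minimal sketch: run a local GGUF model with llama-cpp-python.
# Requires `pip install llama-cpp-python`; the model path is a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="./models/gemma-2b.Q4_0.gguf")  # placeholder GGUF file
out = llm("Why is the sky blue?", max_tokens=64)
print(out["choices"][0]["text"])&lt;/code&gt;&lt;/pre&gt;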
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;vLLM&lt;/p&gt;
&lt;p&gt;Built for high-traffic production APIs and AI startups serving many concurrent requests.&lt;/p&gt;
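&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;For comparison, a minimal sketch of vLLM's offline batch API (requires a GPU-capable pip install vllm; the model id is illustrative):&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;# Minimal sketch: offline batch inference with vLLM.
from vllm import LLM, SamplingParams

llm = LLM(model="google/gemma-2b")  # illustrative Hugging Face model id
outputs = llm.generate(["Why is the sky blue?"], SamplingParams(max_tokens=64))
print(outputs[0].outputs[0].text)&lt;/code&gt;&lt;/pre&gt;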
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jmgik5toh"&gt;Routing (LiteLLM)&lt;/h2&gt;
&lt;p&gt;LiteLLM&lt;/p&gt;
&lt;p&gt;LiteLLM is not an inference engine; it is a proxy/router: a gateway layer that provides a unified, OpenAI-compatible API for calling many LLM providers (cloud and local).&lt;/p&gt;
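&lt;p&gt;Because the proxy speaks the OpenAI wire format, any OpenAI client can point at it. A minimal sketch, assuming a LiteLLM proxy running on localhost:4000 with a model alias "local-gemma" configured (both are illustrative assumptions, not from this post):&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;# Minimal sketch: call a LiteLLM proxy through the standard OpenAI client.
# Assumes `pip install openai` and a proxy on localhost:4000 exposing a
# model alias named "local-gemma" (both illustrative assumptions).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="anything")

resp = client.chat.completions.create(
    model="local-gemma",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(resp.choices[0].message.content)&lt;/code&gt;&lt;/pre&gt;</content>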
  </entry>
  <entry>
    <id>http://blog.matterxiaomi.com/blog/matter-bridge-part2/</id>
    <title>Matter Bridge in Home Assistant Part2 - Install MatterBridge Connect to Home assistant</title>
    <updated>2026-02-16T23:14:37Z</updated>
    <published>2026-02-15T00:02:19Z</published>
    <link href="http://blog.matterxiaomi.com/blog/matter-bridge-part2/" />
    <author>
      <name>test@example.com</name>
      <email>blog.matterxiaomi.com</email>
    </author>
    <category term="matter bridge" />
    <category term="matter bridge" />
    <content type="html">&lt;p&gt;Matter Bridge in Home Assistant Part2 - Install MatterBridge Connect to Home assistant&lt;/p&gt;
&lt;div class="mce-toc"&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jhfa2nbb8"&gt;Quick start&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jhfadikr2"&gt;Install and configure&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jhfa2nbb9"&gt;How to Use&amp;nbsp;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;h2 id="mcetoc_1jhfa2nbb8"&gt;Quick start&lt;/h2&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;I set up Matterbridge as follows:&lt;/p&gt;
&lt;p&gt;Install the Matterbridge Docker container.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Create a long-lived access token to allow the home-assistant-matter-hub container to interact with your Home Assistant instance.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Configure communication between Matter Hub and Home Assistant: Matterbridge connects to Home Assistant using the URL and token.&lt;/p&gt;
&lt;p&gt;Expose a Home Assistant device as a Matter bridge:&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code&gt;open http://192.168.2.125:8482/ via chrome browser

Create a new bridge,

Add device "pattern: switch.air_con" in new bridge

start it to generate a pairing QR code

Connect accessory to Apple Home&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jhfadikr2"&gt;Install and configure&lt;/h2&gt;
&lt;p&gt;docker-compose.yml&lt;/p&gt;
&lt;p&gt;You need to create a long-lived access token in your Home Assistant instance and pass it in like this:&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;services:
  matter-hub:
    image: ghcr.io/t0bst4r/home-assistant-matter-hub:3.0.1
    restart: unless-stopped
    network_mode: host
    environment: # more options can be found in the configuration section
      - HAMH_HOME_ASSISTANT_URL=http://192.168.2.125:8123/
      - HAMH_HOME_ASSISTANT_ACCESS_TOKEN=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJhMzcwZDExYjM4MjE0YzFmYThmZTk3NDZjMDQyODU2NSIsImlhdCI6MTc3MTA4MTk1NSwiZXhwIjoyMDg2NDQxOTU1fQ.vhZD-KhJe4XIXd6_XvBE92y4T5W1aICSfBCbTTCvFL4
      - HAMH_LOG_LEVEL=info
      - HAMH_HTTP_PORT=8482
    volumes:
      - /datadocker/home-assistant-matter-hub:/data&lt;/code&gt;&lt;/pre&gt;
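&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Before relying on the token, you can sanity-check it against the Home Assistant REST API. A minimal sketch using only the standard library (the /api/ ping endpoint and its "API running." reply are standard Home Assistant behavior; the token string is a placeholder):&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;# Minimal sketch: verify a long-lived access token against Home Assistant.
import json
import urllib.request

HA_URL = "http://192.168.2.125:8123/api/"
TOKEN = "paste-your-long-lived-token-here"  # placeholder

req = urllib.request.Request(HA_URL, headers={"Authorization": f"Bearer {TOKEN}"})
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))  # expect {'message': 'API running.'}&lt;/code&gt;&lt;/pre&gt;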
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Now you can visit the web UI at:&lt;/p&gt;
&lt;p&gt;http://192.168.2.125:8482/&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jhfa2nbb9"&gt;How to Use&amp;nbsp;&lt;/h2&gt;
&lt;p&gt;Expose a Home Assistant device as a Matter bridge.&lt;/p&gt;
&lt;p&gt;Open http://192.168.2.125:8482/&lt;/p&gt;
&lt;p&gt;Create a new bridge for the device; the home-assistant-matter-hub container fetches it from Home Assistant via the API.&lt;/p&gt;
&lt;p&gt;type: pattern&lt;/p&gt;
&lt;p&gt;value: light.yeelink_cn_ceiling21_s_2_light&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;{
  "name": "matterbridgeceilling21v2",
  "port": 5543,
  "filter": {
    "include": [
      {
        "type": "pattern",
        "value": "light.yeelink_cn_476690814_ceiling21_s_2_light"
      }
    ],
    "exclude": []
  },
  "featureFlags": {
    "coverDoNotInvertPercentage": false,
    "includeHiddenEntities": false
  }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>http://blog.matterxiaomi.com/blog/ecovacs-part6/</id>
    <title>Ecovacs in Home Assistant Part6 -  Ecovacs Robot MCP Server</title>
    <updated>2026-05-12T18:59:34Z</updated>
    <published>2026-02-13T13:17:29Z</published>
    <link href="http://blog.matterxiaomi.com/blog/ecovacs-part6/" />
    <author>
      <name>test@example.com</name>
      <email>blog.matterxiaomi.com</email>
    </author>
    <category term="vacuum" />
    <content type="html">&lt;p&gt;Official Ecovacs Deebot MCP Server&lt;/p&gt;
&lt;p&gt;MCP protocol&lt;/p&gt;
&lt;p&gt;https://github.com/ecovacs-ai/ecovacs-mcp/blob/main/ecovacs_mcp/robot_mcp_stdio.py&lt;/p&gt;
&lt;p&gt;Created: 2025.04.24&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Official docs:&lt;/p&gt;
&lt;p&gt;https://open.ecovacs.com/#/serviceOverview&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;div class="mce-toc"&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jo24ubo57"&gt;way 1.custom integration&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jo24ubo58"&gt;way 2. mcp client integraton in ha&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jo24ubo57"&gt;way 1.custom integration&lt;/h2&gt;
&lt;p&gt;https://github.com/hoangminh1109/ecovacs_cn&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jo24ubo58"&gt;way 2. mcp client integraton in ha&lt;/h2&gt;
&lt;p&gt;mcp client integration&lt;/p&gt;
&lt;p&gt;https://www.home-assistant.io/integrations/mcp&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;MCP client integration configuration&lt;/p&gt;
&lt;p&gt;The remote MCP server URL for the SSE endpoint, for example http://example/mcp&lt;/p&gt;
&lt;p&gt;Ecovacs SSE Server URL:&lt;/p&gt;
&lt;p&gt;https://mcp-open.ecovacs.cn/sse?ak=your ak&lt;/p&gt;
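&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;To check that the endpoint is reachable, a minimal sketch that just opens the SSE stream and prints raw frames (a real MCP client, such as Home Assistant's, runs a JSON-RPC handshake on top of this; replace YOUR_AK with your own AK):&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;# Minimal sketch: open the Ecovacs SSE endpoint and print raw events.
# This only demonstrates connectivity; it is not a full MCP client.
import urllib.request

URL = "https://mcp-open.ecovacs.cn/sse?ak=YOUR_AK"  # placeholder AK

req = urllib.request.Request(URL, headers={"Accept": "text/event-stream"})
with urllib.request.urlopen(req) as stream:
    for raw in stream:  # SSE frames are newline-delimited "field: value" lines
        line = raw.decode("utf-8").rstrip()
        if line:
            print(line)&lt;/code&gt;&lt;/pre&gt;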
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Useful links&lt;/p&gt;
&lt;p&gt;https://open.ecovacs.com/#/serviceOverview&lt;/p&gt;</content>
  </entry>
  <entry>
    <id>http://blog.matterxiaomi.com/blog/create-wyoming-server-home-assistant-part5/</id>
    <title>ModelScope vs Hugging Face vs k2-fsa.github.io vs Kaldi vs Sherpa</title>
    <updated>2026-04-30T17:02:08Z</updated>
    <published>2026-02-11T19:46:43Z</published>
    <link href="http://blog.matterxiaomi.com/blog/create-wyoming-server-home-assistant-part5/" />
    <author>
      <name>test@example.com</name>
      <email>blog.matterxiaomi.com</email>
    </author>
    <content type="html">&lt;p&gt;Hugging Face, ModelScope, and k2-fsa.github.io (specifically the k2-fsa/sherpa-onnx project) represent different approaches to the machine learning ecosystem。&lt;/p&gt;
&lt;p&gt;Hugging Face and ModelScope host everything; k2-fsa and Sherpa only do speech.&lt;/p&gt;
&lt;p&gt;k2-fsa and Sherpa are highly specialized tools focused on speech recognition (ASR) and synthesis (TTS).&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Sherpa (often referred to as sherpa-onnx or sherpa-ncnn) is a lightweight speech-to-text (ASR) and text-to-speech (TTS) engine.&amp;nbsp;Best for: Deploying speech models on edge devices (Android, iOS, WebAssembly, ARM boards) or high-performance servers, prioritizing low latency and CPU efficiency.&lt;/p&gt;
&lt;div class="mce-toc"&gt;
&lt;h2&gt;Table of Contents&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jhchklu53"&gt;Hugging Face（Global AI&amp;nbsp;model platform）&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jhchiqmm1"&gt;ModelScope(The Alibaba/Chinese AI Industrial model platform)&amp;nbsp;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jhchocht5"&gt;specialized tools focused on speech recognition (ASR) and synthesis (TTS)&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jh8e1hud2"&gt;k2-fsa (Next-Gen Kaldi)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="#mcetoc_1jhcht3fl7"&gt;Sherpa(The Real-Time Speech Deployment Tool)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;p&gt;These four entities represent two different categories: Model Ecosystems (Hugging Face &amp;amp; ModelScope) and Speech Recognition Frameworks (Kaldi &amp;amp; k2-fsa).&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jhchklu53"&gt;&lt;strong&gt;Hugging Face（&lt;/strong&gt;Global AI&amp;nbsp;model platform&lt;strong&gt;）&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;The Global Industry Standard。It supports NLP, computer vision, audio, and multimodal models via transformers and diffusers libraries.&lt;/p&gt;
&lt;p&gt;Download models:&lt;/p&gt;
&lt;p&gt;https://huggingface.co/models&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;https://huggingface.co/FunAudioLLM/SenseVoiceSmall&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;https://huggingface.co/funasr/paraformer-zh/blame/7904416f6cb6290ee7dc0b2ddb2993a9fe4f421a/README.md&lt;/p&gt;
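&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;For scripted downloads, a minimal sketch using the huggingface_hub client (requires pip install huggingface_hub; the repo id matches the SenseVoiceSmall link above, but the filename is illustrative, so check the repo's file list):&lt;/p&gt;
&lt;pre class="language-python"&gt;&lt;code&gt;# Minimal sketch: fetch one model file from the Hugging Face Hub.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="FunAudioLLM/SenseVoiceSmall",
    filename="model.pt",  # assumed filename; verify in the repo's file list
)
print(path)  # local cache path of the downloaded file&lt;/code&gt;&lt;/pre&gt;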
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jhchiqmm1"&gt;&lt;strong&gt;ModelScope&lt;/strong&gt;&lt;strong&gt;(The Alibaba/Chinese AI Industrial model platform)&amp;nbsp;&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;ModelScope is an AI model hub led by Alibaba DAMO Academy. It provides pre-trained models, pipelines, and deployment tools, especially strong in Chinese language and speech technologies.ModelScope is often described as the "Chinese Hugging Face."&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;https://modelscope.cn/search?search=sherpa%20onnx&lt;/p&gt;
&lt;p&gt;Inference frameworks:&lt;/p&gt;
&lt;p&gt;1. funasr&lt;/p&gt;
&lt;p&gt;2. funasr-onnx&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;See: https://github.com/modelscope/FunASR?tab=readme-ov-file#sensevoice&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Alibaba Speech-to-Text models tree&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;Alibaba Speech-to-Text models tree
├── FunASR（ASR 推理框架）
│
│   ├── FunASR 1.3
│   │   ├── Paraformer series
│   │   │   ├── paraformer-zh
│   │   │   ├── paraformer-large
│   │   │   ├── paraformer-streaming
│   │   │   └── paraformer-hotword
│   │
│   ├── FunASR 1.5
│   │   ├── Paraformer（升级版）
│   │   ├── FunASR-Nano series
│   │   │   ├── Nano Small
│   │   │   ├── Nano Streaming
│   │   │   └── Nano Int8（边缘部署）
│   │   │
│   │   ├── SenseVoice series
│   │   │   ├── SenseVoice Small
│   │   │   ├── SenseVoice Large
│   │   │   └── SenseVoice multilingual
│   │
│   └── Inference Framework
│       ├── PyTorch
│       ├── funasr-onnx
│       ├── Sherpa-ONNX
│       └── TensorRT
│
└── FunAudioLLM（语音大模型）
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;https://github.com/FunAudioLLM/SenseVoice&lt;/p&gt;
&lt;p&gt;https://github.com/modelscope/FunASR&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h2 id="mcetoc_1jhchocht5"&gt;specialized tools focused on speech recognition (ASR) and synthesis (TTS)&lt;/h2&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Kaldi: Speech Toolkit&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The "grandfather" of modern speech recognition. It is a C++ based toolkit developed primarily by Dan Povey.&lt;/p&gt;
&lt;p&gt;demo&lt;/p&gt;
&lt;p&gt;KaldiRecognizer&lt;/p&gt;
&lt;p&gt;https://github.com/rhasspy/wyoming-faster-whisper/blob/main/wyoming_faster_whisper/__main__.py&lt;/p&gt;
&lt;h3 id="mcetoc_1jh8e1hud2"&gt;k2-fsa (Next-Gen Kaldi)&lt;/h3&gt;
&lt;p&gt;Speech toolkit: the modern successor to Kaldi.&lt;/p&gt;
&lt;p&gt;What it is: Often called "Next-gen Kaldi." It is a complete rewrite of Kaldi&amp;rsquo;s core concepts to make them natively compatible with &lt;strong&gt;PyTorch&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Key Repositories:&lt;/p&gt;
&lt;p&gt;Icefall: Where the actual training recipes for speech models (like Zipformer) live.&lt;/p&gt;
&lt;p&gt;k2: the core library for differentiable FSTs (classic Kaldi is the older tool that preceded k2-fsa).&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;download sherpa-onnx&lt;/p&gt;
&lt;p&gt;https://github.com/k2-fsa/sherpa-onnx/releases&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Download sherpa-onnx ASR models&lt;/p&gt;
&lt;p&gt;https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html&lt;/p&gt;
&lt;p&gt;https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h4 id="mcetoc_1jh9dcjgo1"&gt;&amp;nbsp;download&amp;nbsp;Silero VAD ONNX model&lt;/h4&gt;
&lt;pre class="language-markup"&gt;&lt;code&gt;# https://k2-fsa.github.io/sherpa/onnx/sense-voice/pretrained.html#sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Download URL: https://k2-fsa.github.io/sherpa/onnx/vad/silero-vad.html#download-models-files&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;h3 id="mcetoc_1jhcht3fl7"&gt;&lt;strong&gt;Sherpa&lt;/strong&gt;&lt;strong&gt;(The Real-Time Speech Deployment Tool)&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;The deployment engine (CPU/GPU, Android, iOS, WebAssembly). It uses models trained in the k2-fsa ecosystem.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;How they all fit together&lt;/p&gt;
&lt;p&gt;1. k2-fsa is the tool you use to build a high-performance speech model.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;2. Once you've trained that model using k2-fsa, you might upload it to Hugging Face or ModelScope so others can download it easily.&lt;/p&gt;
&lt;p&gt;Hugging Face hosts models from both ModelScope and k2-fsa/Sherpa, serving as a distribution point for them.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Layer Relationship&lt;/p&gt;
&lt;pre class="language-markup"&gt;&lt;code&gt;Model Hosting &amp;amp; Distribution
   ├── ModelScope
   └── Hugging Face

Inference / Runtime Framework
   └── k2-fsa / sherpa&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;pre class="language-csharp"&gt;&lt;code&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</content>
  </entry></feed>